The sum of active neutrino masses is well constrained, 58 meV < $m_\nu$ < 0.23 eV, but the origin of
this scale is not well understood. Here we investigate the possibility that it arises by environmental
selection in a large landscape of vacua. Earlier work had noted the detrimental effects of neutrinos on
large scale structure. However, using Boltzmann codes to compute the smoothed density contrast on
Mpc scales, we find that dark matter halos form abundantly for $m_\nu > 10$ eV. This finding rules out
an anthropic origin of $m_\nu$ unless a different catastrophic boundary can be identified. Here we argue
that galaxy formation becomes inefficient for $m_\nu > 10$ eV. We show that in this regime, structure
forms late and is dominated by cluster scales, as in a top-down scenario. This is catastrophic:
baryonic gas will cool too slowly to form stars in an abundance comparable to our universe. With
this novel cooling boundary, we find that the anthropic prediction for $m_\nu$ agrees at better than $2\sigma$
with current observational bounds. A degenerate hierarchy is mildly preferred.


I. INTRODUCTION 

In a theory with a large multidimensional potential
landscape [1], the smallness of the cosmological constant
can be anthropically explained.¹ The lack of a
viable alternative explanation for a small or vanishing
cosmological constant, the increasing evidence for a
fine-tuned weak scale, and several other complexity-favoring
coincidences and tunings in cosmology and the Standard
Model, all motivate us to consider landscape models
seriously, and to extract further pre- or post-dictions
from them.

A large landscape can also explain an aspect of the
Standard Model that has long remained mysterious: the
origin of the masses and mixing angles of the quarks and
leptons. Plausible landscape models allow for some of the
first generation quark and lepton masses to be anthropically
determined, while the remaining parameters are
set purely by the statistical distribution of the Yukawa
matrices. Results are consistent with the observed hierarchical,
generation, and pairing structures [53H3S]. In
such analyses, the overall mass scale of neutrinos may be
held fixed and ascribed, e.g., to a seesaw mechanism. But
ultimately, one expects that the mass scale will vary, no
matter what the dominant origin of neutrino masses is
in the landscape. For Dirac neutrinos, Yukawa couplings
can vary; in the seesaw, a coupling or the right-handed
neutrino mass scale can vary.


¹ It cannot be explained in a one-dimensional landscape, no matter
how large [31], because an empty universe is produced. The
string theory landscape □iniii] is an example of a multidimensional
landscape in which the cosmological constant scans densely
and our vacuum can be produced with sufficient free energy.
Related early work includes EM]. Reviews with varying ranges of
detail and technicality are available, for example [22–28].


Thus we may ask whether anthropic constraints play 
a role in determining the overall scale, or sum, of the 
standard model neutrino masses, 

$$ m_\nu = m(\nu_e) + m(\nu_\mu) + m(\nu_\tau) . \tag{1} $$

Current observational bounds imply 

$$ 58~\text{meV} < m_\nu < 0.23~\text{eV} . \tag{2} $$

The lower bound comes from the mass splittings observed
via solar and atmospheric neutrino oscillations [37]. The
upper bound comes from cosmological observations that
have excluded the effects that more massive neutrinos
would have had on the cosmic microwave background
and on large scale structure [351ISH]. The proximity of
the lower to the upper bound gives us confidence that
cosmological experiments in the coming decade will detect
$m_\nu$, and that they may determine its value with a
precision approaching the $10^{-2}$ eV level [3D].

An anthropic origin of the neutrino mass scale is suggested
by the remarkable coincidence that neutrinos have
affected cosmology just enough for their effects to be noticeable,
but not enough to significantly diminish the
abundance of galaxies. A priori, $m_\nu$ could range over
dozens of orders of magnitude. If $m_\nu$ were only two orders
of magnitude smaller than the observed value, its
effects on cosmology would be hard to discern at all. If
$m_\nu$ were slightly larger, fewer galaxies would form, and
hence fewer observers like us. The goal of this paper is
to assess this coincidence quantitatively.

The basic framework for computing probabilities in
a large landscape of vacua is reviewed in Sec. II, and
the probability distribution $dP/d\log m_\nu$ is computed
in Sec. III. In the remainder of this introduction, we
will describe the key physical effects that enter into the
analysis, and we will present our main results.

Summary: There are two competing effects that determine
the neutrino mass sum. We assume that the statistical
distribution of neutrino masses among the vacua
FIG. 1: (a) The dimensionless power spectrum at $z = 0$ for a range of neutrino masses with a normal hierarchy, computed using
CAMB. From top to bottom at high wavenumber: $m_\nu$ = 0 (red), 5 eV (purple), 10 eV (blue), and 15 eV
(black). Free-streaming of massive neutrinos causes a suppression of power at high wavenumbers. Above a critical neutrino
mass $m_\nu \sim 8$ eV, this effect is large enough so that the dimensionless power spectrum develops a peak near the free-streaming
scale $k_{\rm nr} < k_{\rm gal}$. This implies that the first structure consists of cluster-size halos. (b) We obtain the smoothed density contrast
$\sigma_R$ at $z = 0$ numerically, essentially by integrating $k^3 P(k)$ up to the wavenumber $1/R$; see Eq. (3) and surrounding discussion.
We take $R$ to be the comoving scale of the Milky Way ($10^{12} M_\odot$). The orange (upper) curve corresponds to a normal hierarchy;
the green (lower) to a degenerate hierarchy. We see that neutrinos suppress halo formation only in the regime $m_\nu < 10$ eV, where
the dimensionless power spectrum has no maximum and the integral is dominated by the large-$k$ cutoff. For larger neutrino
masses, the formation of galactic and larger halos is actually enhanced, because the dimensionless power spectrum develops a
peak that dominates the integral. At higher neutrino masses, the peak power increases, due to a decrease of the free-streaming
scale and a lengthening of the matter era. (This is more pronounced for a normal hierarchy.) Hence $\sigma_R$ increases. If observers
formed in proportion to the mass fraction in large dark matter halos, this would rule out an anthropic origin of $m_\nu$; see Fig. 2.


of the landscape favors a large neutrino mass sum, with a 
force of order unity or less (see Sec. IIA for the definition 
of the multiverse force). 

For the anthropic approach to succeed, we must
demonstrate a compensating effect: that neutrino masses
much greater than the observed value are not frequently
observed. That is, we must multiply the prior probability
for some value of $m_\nu$ by the number of observers that
will be produced in regions where $m_\nu$ takes this value.
Observers are usually represented by some proxy such as
galaxies. We consider two models for observations: at
any given time, their rate is proportional to the number
of Milky Way-like galaxies, or proportional to the growth
rate of this galaxy population (see Sec. IIB). We sum this
rate over a spacetime region called the causal patch [17]
(a standard regulator for the divergent spacetime that results
from a positive cosmological constant; see Sec. IIC).


The product of the prior distribution and the abundance of
galaxies yields a predicted probability distribution. As
usual, if the observed value lies some number of standard
deviations from the mean of the predicted distribution,
we reject the model (in this case the anthropic approach
to $m_\nu$) at the corresponding level of confidence.
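
The comparison described above can be sketched numerically. The following is a minimal illustration only, not the procedure used to produce our figures: it assumes a predicted distribution tabulated on a grid, and converts the cumulative probability at an observed value into an equivalent number of Gaussian standard deviations from the median. The function names and the Gaussian test distribution are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist

def cumulative(xs, pdf):
    """Trapezoidal CDF of a tabulated density, normalized to end at 1."""
    cdf = [0.0]
    for i in range(1, len(xs)):
        cdf.append(cdf[-1] + 0.5 * (pdf[i] + pdf[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]
    return [c / total for c in cdf]

def sigma_from_median(xs, pdf, x_obs):
    """Equivalent Gaussian number of sigma between x_obs and the median.

    Negative values mean x_obs lies below the median.
    """
    cdf = cumulative(xs, pdf)
    for i in range(1, len(xs)):
        if xs[i - 1] <= x_obs <= xs[i]:
            # Linear interpolation of the CDF inside the grid cell.
            t = (x_obs - xs[i - 1]) / (xs[i] - xs[i - 1])
            p = cdf[i - 1] + t * (cdf[i] - cdf[i - 1])
            return NormalDist().inv_cdf(p)
    raise ValueError("x_obs lies outside the tabulated range")

# Hypothetical check on a unit Gaussian, where the answer is known.
xs = [-6.0 + 0.01 * i for i in range(1201)]
pdf = [math.exp(-0.5 * x * x) for x in xs]
z_plus = sigma_from_median(xs, pdf, 1.0)    # close to +1
z_minus = sigma_from_median(xs, pdf, -2.0)  # close to -2
```

For a skewed distribution such as the ones considered here, quoting the distance from the median in this way is one reasonable convention among several.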

The neutrino mass spectrum (the individual distribution
of masses among the three active neutrinos) has a
noticeable effect on structure formation. We consider two
extreme cases. In the normal hierarchy, one neutrino
contributes dominantly to the mass sum $m_\nu$; here we approximate
the remaining two as massless. In this case the
observed mass splittings require $m_\nu > 58$ meV. In the degenerate
hierarchy, each mass is of order $m_\nu/3$ (and here
we approximate them as exactly equal). This case will
soon be tested by cosmological observations, since the
observed mass splittings would imply $m_\nu > 150$ meV,
near the present upper limit. We do not explicitly consider
the intermediate case of an inverted hierarchy, with
two nearly degenerate massive neutrinos and one light or
massless neutrino.

The main challenge lies in estimating the galaxy abundance
as a function of $m_\nu$. The effects of one or more
massive neutrinos on structure formation are somewhat
complex; hence, we compute the linear evolution of density
perturbations numerically using the Boltzmann codes
CAMB [32] and CLASS [33], wherever possible. We will
now summarize the key physical effects. A more extensive
summary and analytic approximations are given in
Sec. III and in the appendices; we recommend Refs. [33] [3S]
for detailed study.

After becoming nonrelativistic, neutrinos contribute
approximately as pressureless matter to the Friedmann
equation. However, they contribute very differently from
cold dark matter (CDM) to the growth of perturbations,
because neutrinos are light and move fast. This introduces
a new physical scale into the problem of structure
formation: the free-streaming scale is set by the distance
over which neutrinos travel until becoming nonrelativistic.
It is roughly given by the horizon scale when they
become nonrelativistic, with comoving wavenumber $k_{\rm nr}$
(see Appendix B 1 for more details). On this and smaller
scales, $k > k_{\rm nr}$, neutrinos wipe out their own density
perturbations. More importantly, as a nonclustering matter
component they change the rate at which CDM perturbations
grow, from linear growth in the scale factor ($\delta \propto a$)
on large scales, to sub-linear growth on smaller scales
$k > k_{\rm nr}$. This suppresses the CDM power spectrum at
small scales; see Fig. 1a.

FIG. 2: No cooling boundary: if one assumed that observers
trace dark matter halos of mass $10^{12} M_\odot$ or greater, one would
find a bimodal probability distribution over the neutrino mass
sum $m_\nu$. This distribution is shown here for a normal hierarchy
(orange/upper curve) and degenerate hierarchy (green
curve). The range of values $m_\nu$ consistent with observation
(58 meV < $m_\nu$ < 0.23 eV, shaded in red) is greatly disfavored,
ruling out this model. By contrast, we shall assume
here that observers trace galaxies. Crucially, we shall argue
that for $m_\nu > 10$ eV, galaxies do not form even though halos
do. This novel catastrophic boundary excludes the mass range
above 10 eV, leading to a successful anthropic explanation of
the neutrino mass (see Fig. 3).

The linear quantity most closely related to the abundance
of dark matter halos on the galactic scale $R$ is
not the dimensionless CDM power spectrum $k^3 P_{cc}(k)$.
Rather, halo abundance is controlled by the smoothed
density contrast $\sigma_R$, which is approximately given by the
integrated power,

$$ \sigma_R^2 \sim \int_0^{1/R} \frac{dk}{k}\, k^3 P_{cc}(k) , \tag{3} $$

up to the wavenumber corresponding to the relevant
scale. (A more precise formula is given in the main text,
where we also describe in detail how halo abundance is
computed from $\sigma_R$ using the Press-Schechter formalism.)
This distinction turns out to be crucial for large neutrino
masses.
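
The difference between the power at a given scale and the integrated power can be illustrated with a toy calculation. The sketch below is not CAMB or CLASS output: it assumes two made-up dimensionless spectra in arbitrary units, one rising monotonically and one with a peak well below the galactic wavenumber, and asks what fraction of the integral over $\ln k$ comes from the last e-fold below the cutoff. The names `sigma2`, `rising`, `peaked`, `k_gal`, and `k_nr` are hypothetical and chosen only for this example.

```python
import math

def sigma2(delta2, k_max, k_min=1e-6, n=4000):
    """Integrate delta2(k) d(ln k) from k_min to k_max (trapezoid rule in ln k)."""
    lo, hi = math.log(k_min), math.log(k_max)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * delta2(math.exp(lo + i * h))
    return total * h

def last_efold_fraction(delta2, k_gal):
    """Fraction of the integrated power contributed by k_gal/e < k < k_gal."""
    return sigma2(delta2, k_gal, k_min=k_gal / math.e) / sigma2(delta2, k_gal)

k_gal = 1.0   # toy galactic wavenumber (arbitrary units)
k_nr = 0.01   # toy free-streaming wavenumber, well below k_gal

def rising(k):
    """Toy dimensionless spectrum that increases monotonically with k."""
    return k

def peaked(k):
    """Toy dimensionless spectrum with a maximum near k_nr."""
    return k * math.exp(-(k / k_nr) ** 2)

frac_rising = last_efold_fraction(rising, k_gal)  # about 1 - 1/e
frac_peaked = last_efold_fraction(peaked, k_gal)  # essentially zero
```

In the rising case most of the integral comes from wavenumbers near the cutoff, while in the peaked case the cutoff region contributes almost nothing and the integral is set by the peak.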

We see from Fig. 1 that for small neutrino masses
$m_\nu \lesssim 10$ eV, the integrand $k^3 P_{cc}(k)$ increases monotonically.
Hence the integral for $\sigma_R$ is dominated by its upper
limit, i.e., by the power on the scale $k_{\rm gal} \sim 1/R$. This
yields the “bottom-up” scenario of hierarchical structure
formation familiar from our own universe: small halos
typically form first, and more massive halos virialize
later.

However, for $m_\nu > 10$ eV, the small scale power becomes
so suppressed that the dimensionless power spectrum
develops a maximum at the free-streaming scale
$k_{\rm nr} < k_{\rm gal}$. In this regime, the smoothed density contrast
$\sigma_R$ on galactic scales $R$ is no longer dominated by
the power at wavenumber $\sim 1/R$. Instead, the power at
larger scales than $R$ contributes dominantly to $\sigma_R$. This
results in a top-down scenario, where halos first form on
cluster scales, nearly simultaneously with galactic-scale
halos.

The transition from bottom-up to top-down structure
formation around $m_\nu \approx 10$ eV has not (to our knowledge)
been noted in the context of anthropic explanations of the
neutrino mass sum. We find here that it is crucial to the
analysis, for two reasons. First, it implies that the scales
that dominantly contribute to $\sigma_R$ are unaffected by
free-streaming for $m_\nu > 10$ eV. Therefore, increasing $m_\nu$ beyond
$\sim 10$ eV does not suppress CDM structure. In fact,
we find that $\sigma_R$ increases in this range (Fig. 1b). The second
implication works in the opposite direction: in the
top-down scenario that arises for $m_\nu > 10$ eV, galaxies
will not form inside halos at an abundance comparable to
our universe.

Let us discuss each of these implications in turn. We
begin by pretending that the stellar mass per halo mass is
unaffected by $m_\nu$; in particular, let us suppose that there
is no dramatic suppression of star formation in the top-down
regime, $m_\nu > 10$ eV. If so, we would be justified in
regarding halos as a fair proxy for observers. Here we consider
halos of mass $10^{12} M_\odot$ or greater [15]. From Fig. 1b, we see that halo
abundance decreases with $m_\nu$ up to $m_\nu \approx 10$ eV; then
it begins to increase. Combining this with the assumed
prior distribution that favors large $m_\nu$, we would find that
the probability distribution over $m_\nu$ is bimodal (Fig. 2).
The first peak is at $m_\nu \approx 1$ eV, followed by a minimum
near 10 eV and a second peak at much greater
mass.² Therefore, if observers traced dark matter halos
with $M > 10^{12} M_\odot$, one should conclude that small neutrino
masses are greatly disfavored. Such a result would
be in significant tension with the current upper bound of
0.23 eV, and it would seem to render an anthropic origin
of the neutrino mass sum implausible.

However, our fundamental assumption is that observers
trace galaxies, not halos. In some cosmologies
including our own, galaxies in turn trace halos; if they
do, halos are an equally good proxy. But the change of
regime from bottom-up to top-down structure formation
for $m_\nu > 10$ eV is catastrophic for galaxy formation.

From observation, we know that stars do not form efficiently
in bound structures that are much larger than the
mass scale of our own galactic halo, $10^{12} M_\odot$. Heuristically,
this can be explained by noting that in halos of this
size, the cooling timescale for the baryonic gas is greater
than the age of the universe [48l - f5^. In our universe there
are galaxies because galactic halos, which produce stars
efficiently, formed earlier than these larger halos, which
do not. Clusters inherit galaxies that formed in smaller
halos, but they do not have significant star formation
themselves.

² Fig. 2 does not show the entire peak since CAMB gives results
only for $m_\nu < 40$ eV. Absent the earlier catastrophic boundary
at 10 eV that we will assert, a robust effect that would eventually
suppress the probability at large neutrino mass is the smallness
of the baryon fraction for $m_\nu > 100$ eV. This would suppress
the number of baryons (and hence, observers) in the causal
patch [17]. It would also impose dynamical obstructions to star
formation.

FIG. 3: Our main result: the probability distribution over the neutrino mass sum $m_\nu$ for (a) a normal hierarchy and (b) a
degenerate hierarchy, assuming that observers require galaxies. The plot is the same as Fig. 2, but the mass range is cut off
at 7.7 eV (10.8 eV) in the normal (degenerate) case. For greater masses, the first halos form late and are of cluster size; we
argue that galaxies do not form efficiently in such halos. We use the second observer model described in Sec. II B; results
look nearly identical with the first model. We assume a flat prior over $m_\nu$ (see Fig. 4 for other priors). The central $1\sigma$ and
$2\sigma$ regions are shaded. Vertical red lines indicate the lowest possible values for the neutrino mass consistent with available
data: $m_{\rm obs} \approx 58$ meV for a normal hierarchy, and $m_{\rm obs} \approx 150$ meV for a degenerate hierarchy. We find that these values are
within $2\sigma$ of the median. The agreement would further improve with a less conservative treatment of the detrimental effects of
neutrinos on gas cooling in halos, and/or the cosmological detection of a neutrino mass sum larger than the minimal value.


In a top-down scenario due to large neutrino mass,
however, galactic halos would form much later. They
would typically be embedded in larger halos that virialize
roughly at the same time, with masses characteristic of
galaxy groups or clusters, but without many galaxies to
inherit. The virial temperature and dynamical timescale
relevant for baryon cooling will be set by the largest of
the nested halos. (See Sec. III D and Appendix C for
details.) Therefore, cooling will not be efficient: the top-down
scenario produces star-poor dark matter clumps,
with most baryons remaining in hot gas.

As a first approximation for this cooling boundary, we
cut off the probability distribution at a value $m_\nu \approx 10$ eV
that corresponds to the onset of the top-down regime.
This overestimates the number of galaxies just below the
cutoff and underestimates it just above. In future work,
we plan to include explicit models for successful galaxy
cooling beyond the crude top-down vs. bottom-up criterion.
This should replace the sharp cutoff by a smooth
decay of the probability.

We believe that our argument for a cooling catastrophe
is robust, because the transition to a top-down
scenario is a drastic change of regime. However, the
underlying physics is complicated, involving shocks,
complicated cooling functions, fragmentation, and feedback
from stars, black holes, and supernovae. Suppose
therefore that we are wrong. That is, suppose that
at $m_\nu > 10$ eV, some unanticipated combination of
processes leads to a stellar mass inside the causal patch
that is not much less than in our universe. Then one
would find that large neutrino masses are unsuppressed
(Fig. 2), and the observed value of $m_\nu$ cannot be
explained anthropically. In this sense, the cooling
catastrophe we assert can be regarded as a prediction
of the anthropic approach to the neutrino mass. To
test this prediction, it will be important to investigate
galaxy formation for $m_\nu > 10$ eV using simulations that
give an adequate treatment of cooling flows and feedback.

Results: Our main results, with the cooling cutoff
$m_\nu \lesssim 10$ eV imposed, are shown in Fig. 3. We find that
the currently allowed range of values for $m_\nu$ is entirely
consistent with an anthropic explanation, at better than
$2\sigma$. Fig. 4 shows that our approach succeeds for a
wide range of prior distributions $dP_{\rm vac}/d\log m_\nu \propto m_\nu^n$:
assuming a normal (degenerate) hierarchy, $m_{\rm obs}$ lies within
$2\sigma$ of the median if 0.09 < n < 1.0 (0.09 < n < 1.4).

Our chief conclusion is that the neutrino mass sum can
be anthropically explained, but only if detrimental effects
of neutrinos on galaxy and star formation (rather than
halo formation) already become significant at or below
$m_\nu \approx 10$ eV.

Our results favor larger $m_\nu$ than the minimum values
allowed by the observed mass splittings, and in particular
they favor a degenerate over a normal hierarchy. Since
the observed range is consistent within $2\sigma$ in either case,
these are mild preferences rather than sharp predictions.

There are, however, two additional reasons why a degenerate
hierarchy appears more natural in the context
of the anthropic approach. First, with a normal hierarchy
one might expect that each neutrino mass scans
FIG. 4: The prior distribution of cosmologically produced vacua is assumed to favor large neutrino mass and to have no special
feature near the observed magnitude: $dP_{\rm vac}/d\log m_\nu \propto m_\nu^n$, $n > 0$. One then expects $n \sim O(1)$, and the previous two figures
both show the case $n = 1$. This figure shows that the same conclusions obtain for a considerable range of $n$. (a) The median
of the probability distribution as a function of the multiverse force $n$. (b) The standard deviation of the worst case observed
value from the median, as a function of $n$. The $2\sigma$ region is shaded. Orange (the upper curve at large $n$ in each plot) is for a
normal hierarchy; green is for a degenerate hierarchy.


separately with prior $n_i$. Each prior would have to be
assumed positive and $O(1)$. The prior for $m_\nu$ would then
be $n = \sum_i n_i$, and it becomes less plausible that $n$ should
be small enough to render the anthropic prediction compatible
with observation. A degenerate hierarchy, on the
other hand, may be the result of some flavor symmetry
that links the masses of the individual neutrinos, leaving
only a single scanning parameter. Then it is more plausible
that $n$ is small enough to include the observed $m_\nu$.

The second reason to prefer a degenerate hierarchy 
is that it eliminates a viable anthropic window where 
two neutrinos are extremely massive. If each neutrino 
has mass of order MeV or greater, neutrons would be 
stable, leading to a (catastrophic) helium-dominated
universe [53]. But neutrons will be unstable, and the
catastrophe is averted, if one neutrino remains light and
only the other two become very heavy. With a normal 
or inverted hierarchy, one has to explain why the one 
or two heavy neutrinos did not end up in the extremely 
large mass range above the MeV scale. This can be 
resolved by assuming that the prior distributions for the 
individual neutrino masses do have a feature between 
the eV and the MeV scale, such that the much larger 
scale is disfavored. With a degenerate hierarchy, this 
problem does not arise in the first place, since either all 
neutrinos are light or all are heavy. 

Relation to earlier work: Our analysis builds on the
pioneering work of Tegmark, Vilenkin and Pogosian [53,
54] (see also [55] liS]), who were the first to argue that
the neutrino mass admits an anthropic explanation. We
agree with their conclusion, but we claim here that the
nature of the relevant catastrophic boundary was not correctly
identified.

Ref. [55] does not justify its restriction to the region
$m_\nu \lesssim 10$ eV. Moreover, it employs an analytic approximation
to $\sigma_R$ that greatly underestimates the halo
abundance for $m_\nu \gtrsim 5$–10 eV. With this approximation,
the probability distribution appears to vanish near
10 eV due to a paucity of CDM structure; see Fig. 5.
Thus, suppression of CDM structure due to massive
neutrinos (rather than the obstruction to cooling at
$m_\nu > 10$ eV) would appear to provide the relevant catastrophic
boundary underlying the anthropic explanation
of the neutrino mass sum.

Here we go further in two respects. First, our numerical
computations show that CDM structure becomes unsuppressed
for $m_\nu > 10$ eV. Hence, if neutrino masses have
an anthropic origin, a different catastrophic boundary
is relevant. Second, we identify a specific physical effect,
the transition to a top-down regime, which had not been
noted and which supplies a suitable boundary by suppressing
galaxy formation.

The analytic approximation in question is Eq. (5) in
Ref. [55]. It assumes that massive neutrinos suppress the
smoothed density contrast $\sigma_R$ on galactic scales by the
same factor by which they suppress the matter power
on galactic scales. This is accurate for small neutrino
masses, because in a bottom-up scenario the shortest
scales dominate the integral for $\sigma_R$. The approximation
underestimates the abundance of dark matter halos for
$m_\nu > 10$ eV, because in this regime $\sigma_R$ is dominated by
power at larger scales, which is relatively unsuppressed by
free-streaming. More details can be found after Eq. (3)
and in Sec. III D.

The discrepancy is revealed by explicit numerical computation
of the smoothed density contrast on galactic
scales from Boltzmann codes (see Fig. 1b). One also finds
significantly different results for a normal versus degenerate
hierarchy, a distinction that was suppressed in the
analytical approximation of Ref. [53].

When the halo abundance is correctly computed, the
need for a novel catastrophic boundary at or before 10 eV
becomes evident (Fig. 5). Without it, the probability
distribution would strongly disfavor small neutrino
masses. It would be in significant tension with the
upper bound of 0.23 eV or even 1 eV, and it would seem
to render an anthropic origin of the neutrino mass sum
implausible.

FIG. 5: The dashed black line shows the probability distribution
found by Tegmark et al. [53] for a flat prior over $m_\nu$.
This result would seem to remain compatible at about $2\sigma$
with current observational constraints (red shaded region).
However, the analytic fitting function for the density contrast
$\sigma_R$ used in [53, 54] underestimates $\sigma_R$ above a few eV.
The solid curves show the probability distribution that results
when $\sigma_R$ is computed numerically from Boltzmann codes:
orange/upper = normal hierarchy; green/lower = degenerate hierarchy.
They differ slightly from Fig. 2 because Ref. [53] used
a different measure and observer model. Either way, a successful
anthropic explanation of $m_\nu$ requires the identification
of a catastrophic boundary at or below 10 eV.

Our computation of $\sigma_R$ from Boltzmann codes, and
our compensating identification of a novel catastrophic
boundary at 10 eV, are the main differences from Ref. [53].
Another difference is that we use the causal patch measure
[17] to regulate the infinities of eternal inflation.
Refs. [53, 54] used a different measure that is no longer
viable; see Sec. II C for details. This has a visible but
comparatively small effect on the probability distribution:
by comparing Fig. 2 with Fig. 5, one sees that the
causal patch is somewhat more favorable to an anthropic
explanation of $m_\nu$. The causal patch also renders more
robust our conjecture that star formation is ineffective
for $m_\nu > 10$ eV, as discussed in more detail later.

We also build on the seminal investigation of catastrophic
boundaries in cosmology by Tegmark et al. [48]
(see also Ref. [57]), who emphasized the crucial role of
cooling. We believe that our present work is the first
to associate catastrophic cooling failure with a top-down
structure formation scenario. Ref. [53] points out a
number of distinct catastrophes at very large neutrino
mass. For example, neutrinos act as cold dark matter
for $m_\nu \gg 100$ eV, which also may be detrimental to star
formation. (However, this does not counteract the abundance
of CDM structure we find at $m_\nu > 10$ eV. Fig. 2
illustrates that a cutoff at any scale larger than 10 eV,
say at $m_\nu \approx 30$ eV, would make small neutrino masses
too improbable for an anthropic explanation to work.)


II. PREDICTIONS IN A LARGE LANDSCAPE 

If a theory has a large number of metastable vacua,
most predictions will be statistical in nature. We are
usually interested in understanding the magnitude of a
particular parameter $x$, such as the cosmological constant
or in the present case, the neutrino mass; hence we wish
to compute a probability distribution $dP/d\log x$.

Fundamentally, the probability $dP$ is proportional to
the number of observations $dN_{\rm obs}$ that find the parameter
to lie in the range $(\log x, \log x + d\log x)$. Thus, our
task is to compute $dN_{\rm obs}$. This can be done by weighting
a prior probability distribution $f(x)$, which comes from
the underlying theory, by the number $w(x)$ of observations
that will be made in a vacuum where $x$ takes on a
particular value:


$$ \frac{dP}{d\log x} \propto w(x)\, f(x) . \tag{4} $$


We will discuss each factor in turn. Our presentation in
each subsection will be general at first, before specializing
to the case of the neutrino mass, $x = m_\nu$.
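
As a purely illustrative sketch of how the two factors combine, the following code multiplies a power-law prior by a toy anthropic weight and normalizes the product per unit $\log x$. Both functions are hypothetical placeholders: the exponential weight $e^{-x/x_c}$ is not the weighting derived in this paper, and the values `n_force = 1` and `x_c = 10` are arbitrary choices. With these choices the density per unit $\log x$ is proportional to $x^n e^{-x/x_c}$, which peaks at $x = n\,x_c$.

```python
import math

def log_grid(x_min, x_max, n):
    """Return n points equally spaced in log x between x_min and x_max."""
    step = (math.log(x_max) - math.log(x_min)) / (n - 1)
    return [math.exp(math.log(x_min) + i * step) for i in range(n)]

def distribution(xs, weight, prior):
    """Return dP/dlog x on the grid xs, normalized to unit integral over log x."""
    raw = [weight(x) * prior(x) for x in xs]
    norm = 0.0
    for i in range(1, len(xs)):
        dlog = math.log(xs[i]) - math.log(xs[i - 1])
        norm += 0.5 * (raw[i] + raw[i - 1]) * dlog
    return [r / norm for r in raw]

n_force = 1.0  # hypothetical prior exponent: f(x) proportional to x**n_force
x_c = 10.0     # hypothetical scale above which the toy weight is suppressed

xs = log_grid(1e-3, 1e3, 1201)
density = distribution(
    xs,
    weight=lambda x: math.exp(-x / x_c),  # toy w(x), an assumption
    prior=lambda x: x ** n_force,         # toy f(x), an assumption
)
x_peak = xs[density.index(max(density))]  # close to n_force * x_c
```

The same function accepts any tabulated or analytic weight, so a more realistic $w(x)$ could be substituted without changing the normalization step.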


A. Prior as a Multiverse Probability Force 


The prior is defined by

$$ f(x) = \frac{dN_{\rm vac}}{d\log x} . \tag{5} $$


Here $x$ will be a parameter in the effective theory at low
energies whose scale $\log x$ one would like to predict or
explain; $dN_{\rm vac} = f(x)\, d\log x$ is the number of long-lived
metastable vacua³ in which the parameter takes on values
in the range $(\log x, \log x + d\log x)$.

With the notable exception of the cosmological constant,
the prior distribution for most parameters is not
well known. This is a technical problem: in the string
landscape, $f(x)$ should in principle be computable. In
practice, it is difficult to derive phenomena far below the
fundamental scale (the Planck or string scale) directly.
However, this need not be an obstruction to progress,
any more than the fact that we cannot derive the Standard
Model from a more fundamental theory prevented
us from discovering it.

³ Strictly, what matters is not the abundance of such vacua in the
effective potential but in the multiverse: cosmological dynamics
could favor the production of some vacua over others. For most
low-energy parameters one expects that such selection effects are
uncorrelated with $x$ in the range of interest. In any case, we shall
take the prior $f$ to be an effective distribution that incorporates
cosmological dynamics.

Consider an arbitrary low-energy parameter $x$. In any
large landscape the prior distribution $f(x)$ should admit
an effective description ES EH [sg. To avoid putting
in the answer, one may assume that $f(x)$ has no special
features (such as a maximum) in a wide logarithmic range
of values. This range should include but be much larger
than the range compatible with observation. One can
then parametrize the prior distribution by a “statistical
pressure” or “multiverse probability force” towards larger
or smaller values,


$$ n = \frac{d\log f}{d\log x} , \tag{6} $$


where $n$ is approximately constant. For example, a flat
prior distribution over $\log x$ corresponds to $n = 0$. If the
prior is flat over $x$, $dN_{\rm vac}/dx = \text{const.}$, then $n = 1$.
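
The two examples just given can be checked numerically. The sketch below estimates the logarithmic slope of a prior by a central finite difference in $\log x$; the helper name `probability_force` and the step size are assumptions made for illustration. A prior that is flat in $x$ has $f(x) = dN_{\rm vac}/d\log x = x\, dN_{\rm vac}/dx \propto x$, which is the form used for `flat_in_x`.

```python
import math

def probability_force(f, x, eps=1e-4):
    """Central finite-difference estimate of n = d log f / d log x at x."""
    up = math.log(f(x * math.exp(eps)))
    down = math.log(f(x * math.exp(-eps)))
    return (up - down) / (2.0 * eps)

def flat_in_log_x(x):
    """f(x) for dN/dlog x = const (the constant 2.0 is arbitrary)."""
    return 2.0

def flat_in_x(x):
    """f(x) for dN/dx = const, so that dN/dlog x = x dN/dx is linear in x."""
    return 3.0 * x

n_log_flat = probability_force(flat_in_log_x, 0.1)  # approximately 0
n_x_flat = probability_force(flat_in_x, 0.1)        # approximately 1
```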

Suppose that there is a regime change sufficiently near
the observed value $\log x_0$, such that the number of observers
(or at least, of observers like us) $w(x)$ drops dramatically
above or below a critical value $\log x_c$. Suppose
for definiteness that $x_0 < x_c$. If the probability force
favors large values of $x$, but not too strongly [$n > 0$,
$n \sim O(1)$], then the observed value can be explained.
Similarly, with a negative probability force, one can explain
the proximity of $x_0$ to a nearby catastrophic boundary
at some smaller $x_c < x_0$.
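
The phrase "not too strongly" can be made concrete in an idealized case. The sketch below assumes, purely for illustration, that $w(x)$ is constant below a sharp cutoff $x_c$ and zero above it, so that $dP/d\log x \propto x^n$ for $x < x_c$. The cumulative probability is then $(x/x_c)^n$ and the median is $x_c\, 2^{-1/n}$. The numerical inputs are hypothetical and are not the neutrino values analyzed in this paper.

```python
def fraction_below(x0, x_c, n):
    """P(x < x0) for dP/dlog x proportional to x**n below a sharp cutoff x_c.

    Assumes n > 0 and 0 < x0 <= x_c; the weight w(x) is taken to be a step
    function, which is an idealization made only for this example.
    """
    if n <= 0:
        raise ValueError("the distribution is normalizable only for n > 0")
    if not 0 < x0 <= x_c:
        raise ValueError("x0 must lie in (0, x_c]")
    return (x0 / x_c) ** n

def median(x_c, n):
    """Median of the same idealized distribution."""
    if n <= 0:
        raise ValueError("the distribution is normalizable only for n > 0")
    return x_c * 0.5 ** (1.0 / n)

# Hypothetical case: observed value one decade below the cutoff.
p_n1 = fraction_below(1.0, 10.0, 1.0)  # 0.1
p_n2 = fraction_below(1.0, 10.0, 2.0)  # 0.01
m_n1 = median(10.0, 1.0)               # 5.0
```

In this idealization, doubling the force from $n = 1$ to $n = 2$ reduces the probability of a value one decade below the cutoff by a factor of ten, which is why a force much greater than unity is difficult to reconcile with an observed value well below the boundary.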

Recent successful examples of this approach include
an explanation of the coincidence that dark and baryonic
densities are comparable 133 , the fine-tuning of the
weak scale UBESlEn], and the comparability of several
large, a priori unrelated timescales in cosmology [57]. In
each case, the required assumption about the probability
force is weak and qualitative: $n \sim \pm O(1)$. Thus
phenomenological models of the landscape have significant
explanatory value, while constraining the underlying
prior distributions through the sign (and roughly the
strength) of the probability force $n$. It is particularly
instructive to keep track of the combination of (and possible
conflict between) forces $n_i$ needed to simultaneously
explain multiple parameters $x_i$ [STlEnilSI].

Now let us turn to the prior for the total neutrino mass,
$P_{\rm vac}(m_\nu)$. We know of no physical reason why a minimum
neutrino mass should be necessary for observers.
Hence, to obtain a normalizable probability distribution,
we must assume that the effective prior distribution
favors large $m_\nu$,


dV.. 


d log nil 


(X m" , n > 0 , 


(7) 


in some large logarithmic neighborhood of the observed value, $\sim 0.1$ eV. A natural and simple choice is $n = 1$, and we will use this value for definiteness in most plots. More generally, we will find that a comfortable range of values $0 < n \lesssim O(1)$ is consistent with an anthropic explanation of the neutrino mass, but not a value much greater than 1 (see Fig. 1).


B. Anthropic Weighting 


The probability distribution over $\log x$ relevant for comparing the theory with observation is obtained by conditioning the prior on the presence of observers. More quantitatively, one weights by the number of observations

$$ w(x) = \frac{dN_{\rm obs}}{dN_{\rm vac}} \qquad (8) $$

that are made in a vacuum where $x$ takes on a specific value. Generically, $w(x)$ will be unsuppressed in a large region either above or below the observed value, or both.
Thus, the anthropic factor is not doing all the work; the 
prior distribution is crucial for comparing the theory to 
observation. 

In this paper we will consider two different models for the number of observations $w(x)$ that are performed in the universe. Both are based on the assumption that observers require galaxies, say of halo mass comparable to the Milky Way’s, $10^{12}$ solar masses. The first model assumes that the rate at which observations occur in a given spatial region per unit proper time, $\dot w(x)$, is proportional to the total mass $M_{\rm gal}$ of such galaxies, at every instant; hence

$$ w(x) = \int dt \, M_{\rm gal}(t) \quad \text{(Observer Model 1)} . \qquad (9) $$

The second model (which reduces to the choice made in [53]) assumes instead that the rate of observation is proportional to the rate $\dot M_{\rm gal}$ at which the above total galaxy mass grows:

$$ w(x) = \int dt \, \dot M_{\rm gal}(t) \quad \text{(Observer Model 2)} . \qquad (10) $$

The two models can be thought of as two different approximations taken to an extreme. In the first, observations would be made continuously in the galaxy, at fixed rate per unit stellar mass, no matter how old the stars become. In the second model, observations would occur instantaneously as baryons cool and form stars; no observations would be assigned to a galaxy that is not growing. (The second model was used in Ref. [53]; note that in the context of the measure used there the integral over time is trivial, yielding the collapse fraction $F_R$.) The truth is likely somewhere in between the two models. However, we will find that our results depend only weakly on the model, so we expect our conclusions to be robust.
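To make the difference between the two observer models concrete, the sketch below integrates a toy galaxy mass history in both ways. The logistic form of `toy_galaxy_mass`, its parameters, and the time range are purely hypothetical choices for illustration; the paper's actual $M_{\rm gal}(t)$ comes from the causal patch and Press-Schechter calculation described later.

```python
import math

def trapezoid(ys, ts):
    """Trapezoidal rule for samples ys on the grid ts."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (ts[i + 1] - ts[i])
               for i in range(len(ts) - 1))

def toy_galaxy_mass(t, t_form=3.0, width=1.0):
    """Hypothetical M_gal(t) in arbitrary units: smooth growth around t_form (Gyr)."""
    return 1.0 / (1.0 + math.exp(-(t - t_form) / width))

def observer_weights(mass, t_start=0.0, t_end=20.0, steps=4000, h=1e-5):
    dt = (t_end - t_start) / steps
    ts = [t_start + i * dt for i in range(steps + 1)]
    # Model 1: observations track the galaxy mass present at each instant.
    w1 = trapezoid([mass(t) for t in ts], ts)
    # Model 2: observations track the growth rate dM_gal/dt (central difference).
    mdot = [(mass(t + h) - mass(t - h)) / (2.0 * h) for t in ts]
    w2 = trapezoid(mdot, ts)
    return w1, w2

w1, w2 = observer_weights(toy_galaxy_mass)
print("model 1:", w1, "model 2:", w2)
```

In this unweighted toy case the second model simply returns the net growth of the galaxy mass, whereas the first keeps accumulating weight for as long as the galaxies exist.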


C. Measure 

A cosmology with at least one long-lived de Sitter vacuum gives rise to eternal inflation: the universe will grow without bound and remain at finite temperature in arbitrarily large volumes at late times. Hence, all possible events will occur infinitely many times. This applies in particular to observations. Thus a regulator or “measure” must be introduced to obtain a finite anthropic factor $w(x)$. For this problem to exist, it is not necessary that the theory predict a large landscape; one de Sitter
vacuum (such as, apparently, ours [521 IS3]) is enough. 
But the measure problem becomes particularly glaring 
in the landscape context: globally, every type of vacuum 
bubble is produced infinitely many times, and each bub¬ 
ble universe contains an infinite comoving volume. 

Existing analyses of the anthropic origin of neutrino 
masses preceded a period of significant progress on the 
measure problem of eternal inflation. Following Wein¬ 
berg |2], Refs. |S31 IMj regulate the divergences of the 
cosmological dynamics by estimating the number of observers per baryon. This measure can no longer be considered viable [531 [ 55 ]. Note, however, that our choice of measure is not responsible for the main differences between our results and those of [53], as described at the end of Sec. III.

In this paper, we will use the causal patch measure [41], which regulates eternal inflation by considering a single causally connected region and averaging over its possible histories. This proposal is very generally defined, requiring only causal structure. It is also well motivated: it merely applies to cosmology an existing restriction that was already needed for the unitary evaporation of black holes [55]. Though proposed on formal grounds, the causal patch has met with phenomenological success; two examples are described in Appendix A. We take this as evidence that it approximates the correct measure well (at least in regions with positive cosmological constant [57]).

A potential landscape is consistent with the observed cosmological history only if it is multi-dimensional with large energy differences between neighboring vacua [27]. String theory gives rise to such a structure upon compactification to three spatial dimensions [1], with $\Delta\Lambda$ not much below unity.

The causal patch will contain a particular decay chain through de Sitter vacua in the landscape, ending with a big crunch in a vacuum with negative cosmological constant; each such chain is weighted by its probability, i.e., by the product of branching ratios [41]. For a typical decay chain, none of the vacua will have anomalously small cosmological constant $\Lambda \ll \Delta\Lambda$. Thus, after conditioning on observers, there will be one vacuum with small


^ The decay of our parent vacuum must release enough energy to 
heat our universe at least to the temperature of big bang nucle¬ 
osynthesis, which requires AA 1 (MeV)^. This is the reason 
why a multidimensional landscape is essential. One-dimensional 
“washboard” landscapes [na are ruled out, because they must 
have AA < 10“^® (MeV)^ so as to naturally include at least one 
vacuum like ours. 


cosmological constant in the causal patch, and we need 
only be concerned with how the causal patch regulates 
the volume of the corresponding bubble universe. 

Here we focus on the variation of the neutrino mass only, so we shall take this vacuum to be otherwise like ours. In particular we set the cosmological constant to the observed value, $\Lambda \sim 10^{-123}$, and we take the spatial geometry to be flat. The metric is of the Friedmann-Robertson-Walker (FRW) type:

$$ ds^2 = -dt^2 + a(t)^2 \left( dr^2 + r^2 d\Omega^2 \right) , \qquad (11) $$

where $a$ is the scale factor, $r$ is the comoving radius, $t$ is proper time, and $d\Omega^2$ is the metric on the unit two-sphere.

By definition, the causal patch is the causal past of the 
future endpoint of a geodesic; thus its boundary consists 
of the past light-cone of such a point. We are interested 
in the boundaries of the causal patch during the time 
when a long-lived de Sitter vacuum still contains matter. 
A future decay has an exponentially small effect on the 
location of the patch boundary at much earlier times, 
so the patch can be computed by treating the vacuum 
as completely stable. The patch boundary is thus the 
cosmological event horizon. Its comoving radius at FRW 
time t is obtained by tracing a light-ray back from future 
de Sitter infinity: 

$$ r_{\rm patch}(t) = \int_t^\infty \frac{dt'}{a(t')} . \qquad (12) $$

The physical volume of the patch is

$$ V_{\rm phys}(t) = \frac{4\pi}{3} \, a(t)^3 \, r_{\rm patch}(t)^3 . \qquad (13) $$
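The event-horizon integral for $r_{\rm patch}$ is easy to evaluate numerically. The sketch below is a minimal illustration that assumes a flat universe containing only pressureless matter and vacuum energy (radiation and the neutrino contribution are neglected), with density parameters taken as inputs. It uses $dt = da/(aH)$ and the substitution $u = 1/a$ to turn the integral into a finite one; the function names and the Simpson-rule discretization are choices made for this example.

```python
import math

def comoving_patch_radius(a, omega_m=0.3065, omega_l=0.6935, n=2000):
    """Comoving event-horizon radius at scale factor a, in units of c/H0.

    With dt = da / (a H) and u = 1/a, the integral from t to infinity of dt'/a(t')
    becomes the integral from 0 to 1/a of du / sqrt(omega_m u^3 + omega_l).
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    u_max = 1.0 / a
    h = u_max / n

    def integrand(u):
        return 1.0 / math.sqrt(omega_m * u ** 3 + omega_l)

    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def physical_patch_volume(a, omega_m=0.3065, omega_l=0.6935):
    """Physical patch volume (4 pi / 3) (a r_patch)^3, in units of (c/H0)^3."""
    r = comoving_patch_radius(a, omega_m, omega_l)
    return 4.0 * math.pi / 3.0 * (a * r) ** 3

for a in (0.1, 1.0, 10.0):
    print(a, comoving_patch_radius(a), physical_patch_volume(a))
```

At late times the physical radius $a\,r_{\rm patch}$ approaches the de Sitter horizon $1/(H_0\sqrt{\Omega_\Lambda})$, while the comoving radius shrinks toward zero.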

As described in the previous subsection, we estimate 
the rate of observations per unit time as proportional to 
the total mass of all galaxies in the physical volume of the 
patch (for observer model 1), or to the rate of increase 
of this mass (for observer model 2). We can write this 
quantity as 

$$ M_{\rm gal}(t) = \rho_{bc}(t) \, V_{\rm phys}(t) \, F_R(t) \, G_R(t) . \qquad (14) $$

The first two factors give the total mass of baryons and cold dark matter in the patch at the time $t$. The collapse fraction $F_R$ is the fraction of this mass that is contained in halos of mass greater than $10^{12} M_\odot$, corresponding to a comoving distance scale $R$: $M_{\rm halo} = M_{bc} F_R$. The galaxy fraction $G_R$ is the fraction of this latter mass that represents baryons in galaxies, $M_{\rm gal} = M_{\rm halo} G_R$.

Combining this with the equations above, the (unnormalized) probability distribution over the neutrino mass is given by

$$ \frac{d\mathcal{P}}{d\log m_\nu} \propto m_\nu^{n} \int dt \, (r_{\rm patch} a)^3 \, \rho_{bc} \, F_R G_R \qquad (15) $$

in the first observer model; we replace $F_R G_R$ by $\frac{d}{dt}(F_R G_R)$ for the second observer model. Factors in the integrand may in general depend on both $m_\nu$ and $t$.


III. CALCULATION OF $d\mathcal{P}/d\log m_\nu$

A. Fixed, Variable, and Time-dependent 
Parameters 

We will consider a one-parameter family of cosmologies, differing from our universe only in the total mass of active neutrinos. More precisely, we consider two such families, since we treat the cases of normal and degenerate neutrino hierarchy separately. Thus, we hold fixed all fundamental parameters other than $m_\nu$. In particular, we fix the vacuum energy density, $\rho_\Lambda = \Lambda/8\pi G$, and the spatially flat geometry of the universe (imposed, presumably, by a mechanism like inflation that is uncorrelated with $m_\nu$). We also hold fixed $\chi_b = \rho_b/n_\gamma$ and $\chi_c = \rho_c/n_\gamma$, the masses per photon of baryons and CDM. These quantities remain invariant under changes of $m_\nu$ since we hold fixed the fundamental processes that produced the observed baryon and CDM abundances.

For the actual values of these parameters, we use the Planck TT+lowP+lensing+ext best fit cosmological parameters [33]; see Table I. The best fit assumes a neutrino mass of about 0.06 eV [33], whereas strictly, one should use a best fit marginalized over $m_\nu$ for the purposes of our paper. However, this has virtually no effect on the fixed cosmological parameters such as $\rho_\Lambda$, $\chi_b$, and $\chi_c$, because neutrinos are already constrained to contribute a very small fraction to the total density. For example, the best fit for the Hubble parameter (Planck TT+lowP+lensing+ext [53]) shifts from $67.9 \pm 0.55$ ($m_\nu \approx 0.06$ eV) to $67.7 \pm 0.6$ (marginalized over $m_\nu$). This difference is negligible compared to current error bars and the discrepancies between different cosmological datasets.


® Note that the time derivative should not be taken of the entire 
integrand, for this model. The loss of mass across the horizon due 
to the shrinking comoving volume of the patch does not produce 
“negative galaxies” inside the patch. At some cost in readability, 
we could have made this more explicit by defining the integrand
in Eq. (15) as the causal patch volume times the rate of change
of the average physical density contributed by galaxies. 

® It would clearly be of interest to compute the probability distri¬ 
bution over several parameters including the neutrino mass; for 
examples of multivariate probability distributions in the land¬ 
scape, see e.g. m- Each additional scanning parameter is an 
additional opportunity to falsify the model. But already with
one parameter scanning, one can falsify a model, in the usual 
way: by computing a probability distribution from the theory. If 
one finds that the observed value is several standard deviations 
from the mean, the model is ruled out at the corresponding level 
of confidence. 

^ Unless otherwise specified, we quote the Hubble parameter in 
units km s$^{-1}$ Mpc$^{-1}$ throughout.


TABLE I: The cosmological parameters used in our calculation, as well as the resulting mass per photon of baryons and CDM, $\chi_b$ and $\chi_c$. $T_{\rm CMB}$ is a Planck TT+lowP+BAO fit, while all others are from Planck TT+lowP+lensing+ext best fit values. We take $k_{\rm pivot} = 0.05$ Mpc$^{-1}$.

Parameter             Value
$T_{\rm CMB}$         2.722 K
$H_0$                 67.90
$\Omega_m$            0.3065
$\Omega_\Lambda$      0.6935
$\Omega_b h^2$        0.02227
$\Omega_c h^2$        0.1184
$10^9 A_s$            2.143
$n_s$                 0.9681
$\chi_b$              0.5745 eV
$\chi_c$              3.054 eV
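The masses per photon in Table I follow from the physical densities and the photon number density of a blackbody at $T_{\rm CMB}$. The sketch below redoes this arithmetic; the physical constants (critical density, $k_B$, $\hbar c$, $\zeta(3)$) are standard values supplied for this example rather than quoted from the paper, so the result should only be expected to match the tabulated $\chi_b$ and $\chi_c$ to within about a percent.

```python
import math

K_B = 8.617333e-5        # Boltzmann constant in eV / K
HBAR_C = 1.973270e-5     # hbar * c in eV cm
ZETA_3 = 1.2020569       # Riemann zeta(3)
RHO_CRIT_H2 = 1.05371e4  # critical density divided by h^2, in eV / cm^3

def photon_number_density(t_cmb):
    """Blackbody photon number density in cm^-3 at temperature t_cmb (kelvin)."""
    return 2.0 * ZETA_3 / math.pi ** 2 * (K_B * t_cmb / HBAR_C) ** 3

def mass_per_photon(omega_h2, t_cmb=2.722):
    """chi = rho / n_gamma in eV for a component with physical density omega_h2."""
    return omega_h2 * RHO_CRIT_H2 / photon_number_density(t_cmb)

print("n_gamma =", photon_number_density(2.722), "cm^-3")
print("chi_b   =", mass_per_photon(0.02227), "eV")
print("chi_c   =", mass_per_photon(0.1184), "eV")
```

Because both $\rho_{b,c}$ and $n_\gamma$ dilute as $a^{-3}$, these ratios are the same at all times after the relevant abundances were set, which is why they serve as time-independent labels for each cosmology.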


When considering entire cosmological histories, as we do, it is best to specify each cosmology in terms of time-independent parameters such as $\Lambda$, $\chi_b$, $\chi_c$, and $m_\nu$. However, we use Boltzmann codes such as CAMB and CLASS to compute power spectra wherever possible (i.e., for $z \geq 0$). These codes expect input parameters that specify the cosmological model in terms of their present values, at redshift $z = 0$. It is not clear what one would mean by the “present” time in an alternate cosmology, but for the purposes of CAMB and CLASS, $z = 0$ is defined to be the time at which the CMB temperature takes the observed value, $T_{\rm CMB} \approx 2.7$ K.

Thus we must derive the values of various time-dependent quantities at the time when the universe reaches this temperature, as a function of $m_\nu$, with other time-independent parameters fixed as described above. One finds for the Hubble parameter and the density parameters

$$ H(m_\nu; z = 0) = H_0 \left[ \frac{\chi_{bc\Lambda\nu}}{\chi_{bc\Lambda\nu 0}} \right]^{1/2} , \qquad (16) $$

$$ \Omega_X(m_\nu; z = 0) = \frac{\chi_X}{\chi_{bc\Lambda\nu}} , \quad X \in \{b, c, \Lambda, \nu\} . \qquad (17) $$

Here multiple indices imply summation, for example $\chi_{bc} = \chi_b + \chi_c$. The fixed parameters $\chi_b$ and $\chi_c$ were defined above. The fixed parameter $\chi_\Lambda = \rho_\Lambda/n_\gamma(z = 0)$ is defined for notational convenience as the observed vacuum energy per photon at the present observed CMB temperature. The $m_\nu$-dependent parameter $\chi_\nu(m_\nu) = \rho_\nu/n_\gamma$ is the neutrino mass per photon. $H_0 = H(0.06\,{\rm eV}; z = 0)$ and $\chi_{\nu 0} = \chi_\nu(0.06\,{\rm eV})$ are observed values, corresponding to the Planck best fit baseline model.
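The rescaling of the Hubble and density parameters with $m_\nu$ can be sketched in a few lines. The example assumes that $H^2$ at fixed CMB temperature is proportional to the summed mass per photon $\chi_b + \chi_c + \chi_\Lambda + \chi_\nu$, that neutrinos are non-relativistic with $\chi_\nu = (3/11)\,m_\nu$ (the standard relic number density per photon), and that $\chi_\Lambda$ is fixed by requiring $\Omega_\Lambda = 0.6935$ in the reference cosmology. These are assumptions of this illustration, not expressions quoted from the paper.

```python
CHI_B, CHI_C = 0.5745, 3.054   # eV per photon, from Table I
H0, OMEGA_L0 = 67.90, 0.6935   # Table I values for the reference cosmology
M_NU_0 = 0.06                  # reference neutrino mass sum in eV

def chi_nu(m_nu):
    """Assumed neutrino mass per photon: non-relativistic relics, n_nu / n_gamma = 3/11."""
    return 3.0 / 11.0 * m_nu

# Fix chi_Lambda so that Omega_Lambda = OMEGA_L0 when m_nu = M_NU_0.
CHI_L = OMEGA_L0 / (1.0 - OMEGA_L0) * (CHI_B + CHI_C + chi_nu(M_NU_0))

def chi_total(m_nu):
    return CHI_B + CHI_C + CHI_L + chi_nu(m_nu)

def hubble_at_z0(m_nu):
    """H at the time when T_CMB takes its observed value, in km/s/Mpc."""
    return H0 * (chi_total(m_nu) / chi_total(M_NU_0)) ** 0.5

def density_parameters(m_nu):
    total = chi_total(m_nu)
    return {"b": CHI_B / total, "c": CHI_C / total,
            "Lambda": CHI_L / total, "nu": chi_nu(m_nu) / total}

for m in (0.06, 4.0, 8.0):
    print(m, hubble_at_z0(m), density_parameters(m))
```

Heavier neutrinos raise the total density at fixed temperature, so the Hubble rate at "$z = 0$" grows and the vacuum energy fraction shrinks.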
FIG. 6: The comoving volume of the causal patch for $m_\nu = 0$ eV (black), $m_\nu = 4$ eV (blue), and $m_\nu = 8$ eV (red).


B. Homogeneous Evolution 


For computing the volume of the causal patch, the factor $(r_{\rm patch} a)^3$ in Eq. (15), we will need to know the scale factor. Unless structure is present, the integrand will be suppressed by the Press-Schechter factor $F_R$; hence it suffices to use an analytic solution valid to excellent approximation in the matter and vacuum eras:

$$ a(t) = \left[ \cot A \, \sinh\!\left( \frac{3 \sin A}{2} \, H t \right) \right]^{2/3} , \qquad (18) $$

where $H \equiv H(m_\nu; z = 0)$ is given by Eq. (16). The solution depends on $m_\nu$ through $\Omega_\Lambda(m_\nu; z = 0) = \sin^2 A$.
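The analytic matter-plus-vacuum solution is simple to check numerically. The sketch below assumes the form $a(t) = [\cot A \,\sinh(\tfrac{3}{2}\sin A\, H t)]^{2/3}$ with $\sin^2 A = \Omega_\Lambda$, and verifies that it is normalized to $a = 1$ at the time when the Hubble rate equals $H$. The unit conversion factor and function names are choices made for this example.

```python
import math

KM_S_MPC_TO_PER_GYR = 1.02271e-3  # 1 km/s/Mpc expressed in 1/Gyr

def scale_factor(t, omega_l, hubble):
    """a(t) for flat matter + Lambda; t in Gyr, hubble in 1/Gyr, sin^2(A) = omega_l."""
    sin_a = math.sqrt(omega_l)
    cot_a = math.sqrt(1.0 - omega_l) / sin_a
    return (cot_a * math.sinh(1.5 * sin_a * hubble * t)) ** (2.0 / 3.0)

def present_age(omega_l, hubble):
    """Time at which a = 1, i.e. where sinh(1.5 sin(A) H t) = tan(A)."""
    sin_a = math.sqrt(omega_l)
    tan_a = sin_a / math.sqrt(1.0 - omega_l)
    return math.asinh(tan_a) / (1.5 * sin_a * hubble)

hubble = 67.90 * KM_S_MPC_TO_PER_GYR
t0 = present_age(0.6935, hubble)
print("age at a = 1:", t0, "Gyr")
print("a(t0) =", scale_factor(t0, 0.6935, hubble))
```

At early times the solution reduces to the matter-era behavior $a \propto t^{2/3}$, and at late times to exponential de Sitter expansion.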


Since $\chi_{bc}$ does not depend on $m_\nu$ and $\rho_{bc} = n_\gamma \chi_{bc}$, $\rho_{bc}(z = 0)$ does not depend on $m_\nu$. Moreover, since the scale factor in Eq. (18) is normalized so that $a = 1$ at $z = 0$, we have $\rho_{bc}(t) = \rho_{bc}(z = 0)/a(t)^3$ for all values of $m_\nu$. Thus Eq. (15) simplifies to

$$ \frac{d\mathcal{P}}{d\log m_\nu} \propto m_\nu^{n} \int dt \, r_{\rm patch}^3 \, F_R G_R , \qquad (19) $$

where $r_{\rm patch}$ is given by Eq. (12), and we are dropping $m_\nu$-independent normalization factors as usual.

The comoving volume of the causal patch is shown in Fig. 6. We note that already at the homogeneous level,
a nonzero neutrino mass is slightly disfavored because it 
decreases the size of the causal patch. We also note that 
the patch size is maximal at early times and decreases 
rapidly. Hence galaxies that form very late effectively 
fail to contribute to the probability for a given parameter 
value. 


C. Halo Formation 


The next factor in Eq. (19) is the collapse fraction $F_R(m_\nu, t)$. It captures the effects of neutrinos on structure formation. Recall that $F_R$ is defined as the fraction of baryonic and cold dark matter in virialized halos of mass scale $10^{12} M_\odot$ or greater. This corresponds to a comoving distance scale $R \approx 1.8$ Mpc.
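The collapse fraction can be determined using the Press-Schechter formalism [68]. Before nonlinearities are important, the density contrast $\delta(\mathbf{x}, t)$ smoothed on a scale $R$ has a Gaussian distribution,

$$ \mathcal{P}(\delta, t) \, d\delta = \frac{1}{\sqrt{2\pi}\,\sigma_R(t)} \exp\!\left( -\frac{\delta^2}{2\sigma_R^2(t)} \right) d\delta , \qquad (20) $$

with standard deviation $\sigma_R(t)$. Fluctuations that exceed a certain threshold $\delta_* \sim O(1)$ in the linear analysis will have become gravitationally bound. Hence, including the conventional Press-Schechter factor of two,

$$ F_R(t) = {\rm erfc}\!\left( \frac{\delta_*}{\sqrt{2}\,\sigma_R(t)} \right) . \qquad (21) $$

We use the canonical value $\delta_* = 1.69$, which is obtained by comparing the linear perturbation to a spherical collapse model.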
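For orientation, the collapse fraction can be evaluated directly with the complementary error function. The sketch below assumes the standard Press-Schechter form $F_R = {\rm erfc}\big(\delta_*/(\sqrt{2}\,\sigma_R)\big)$ with $\delta_* = 1.69$; the sample values of $\sigma_R$ are arbitrary and only meant to show how steeply $F_R$ depends on the smoothed density contrast.

```python
import math

DELTA_STAR = 1.69  # linear collapse threshold from the spherical collapse model

def collapse_fraction(sigma_r, delta_star=DELTA_STAR):
    """Press-Schechter collapse fraction for a smoothed density contrast sigma_r."""
    if sigma_r <= 0.0:
        return 0.0  # no fluctuations, nothing collapses
    return math.erfc(delta_star / (math.sqrt(2.0) * sigma_r))

for sigma in (0.25, 0.5, 1.0, 2.0):
    print(sigma, collapse_fraction(sigma))
```

Because the argument of erfc scales as $1/\sigma_R$, a modest suppression of $\sigma_R$ by massive neutrinos translates into a large suppression of $F_R$ while $\sigma_R \lesssim 1$.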

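The standard deviation of the smoothed density contrast is given by

$$ \sigma_R^2 \equiv \left\langle \delta_R^2(\mathbf{x}) \right\rangle , \qquad (22) $$

with

$$ \delta_R(\mathbf{x}) = \int d^3x' \, \delta(\mathbf{x}') \, W_R(|\mathbf{x} - \mathbf{x}'|) , \qquad (23) $$

where $\delta(\mathbf{x}) = \delta\rho_c/\rho_c$ is the fractional overdensity of cold dark matter. We use the top hat window function, $W_R(\mathbf{x}) = 1$ for $|\mathbf{x}| < R$ and $W_R(\mathbf{x}) = 0$ otherwise.

Equivalently, the smoothed density contrast can be computed from Fourier-transformed quantities:

$$ \sigma_R^2 = \frac{1}{2\pi^2} \int_0^\infty \frac{dk}{k} \, k^3 P_{cc}(k) \, |W_R(k)|^2 , \qquad (24) $$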


Footnote: The comoving scale $R$ is independent of $m_\nu$ because $\rho_{bc}(z = 0)$ is. However, when expressed in units of Mpc/$h$ it depends on $m_\nu$ through Eq. (16).

Footnote: We use the CDM density contrast and power spectrum to compute the Press-Schechter factor $F$. This matches $N$-body simulations better than using the full matter density contrast including neutrinos. It is also a conservative choice, since the total matter power spectrum is further suppressed at large $m_\nu$ by a factor $(1 - f_\nu)^2$ below the free streaming scale.

Footnote: For structure that forms in the vacuum era, the collapse threshold is slightly lowered [53], whereas in the presence of an appreciable neutrino fraction $\delta_*$ should be slightly increased [70]. If we adapted $\delta_*$ accordingly, the net effect would be to further suppress structure at large $m_\nu$, in favor of an anthropic origin of the neutrino mass. However, appropriate values of $\delta_*$ have so far been estimated only for rather small neutrino masses. Ultimately, it would be preferable to sidestep the Press-Schechter approximation altogether. Our analysis could be dramatically improved by using proper $N$-body simulations to compute structure formation, including an adequate treatment of baryonic physics.
FIG. 7: The Press-Schechter factor (solid lines) and its derivative (dashed lines) at the galaxy scale, for a normal hierarchy with $m_\nu = 0$ eV (black), $m_\nu = 2$ eV (brown), $m_\nu = 4$ eV (blue), $m_\nu = 6$ eV (purple) and $m_\nu = 8$ eV (red). Each is used to define one of the two observer models of Sec. II. Note that massive neutrinos suppress structure at all times, but much more so at early times [72-77].


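where $W_R(k) = \frac{3}{(kR)^3} \left( \sin kR - kR \cos kR \right)$ is the Fourier transform of the (volume-normalized) top hat window. The CDM power spectrum is defined by

$$ \langle \delta(\mathbf{k}) \, \delta(\mathbf{k}') \rangle = (2\pi)^3 P_{cc}(k) \, \delta^3(\mathbf{k} - \mathbf{k}') , \qquad (25) $$

where $\delta(\mathbf{k})$ is the Fourier transform of $\delta(\mathbf{x})$ and $\delta^3$ is the Dirac delta function.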
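The Fourier-space expression for the smoothed density contrast can be evaluated with a short numerical integration. The sketch below assumes $\sigma_R^2 = \frac{1}{2\pi^2}\int \frac{dk}{k}\, k^3 P(k)\, |W_R(k)|^2$ with the top hat window in Fourier space, and integrates in $\log k$ with the trapezoidal rule. The power spectrum used here, `toy_power_spectrum`, is a hypothetical smooth shape with made-up parameters; in the actual calculation $P_{cc}(k)$ comes from a Boltzmann code.

```python
import math

def tophat_window(k, r):
    """Fourier transform of the volume-normalized top hat of radius r."""
    x = k * r
    if x < 1e-3:
        return 1.0 - x * x / 10.0  # series expansion avoids cancellation at small x
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def toy_power_spectrum(k, k_eq=0.015, amplitude=2.0e4):
    """Hypothetical P(k): rises like k at small k and falls like k^-3 at large k."""
    q = k / k_eq
    return amplitude * q / (1.0 + q * q) ** 2

def sigma_r(power, r, k_min=1e-4, k_max=1e2, n=4000):
    """Smoothed density contrast sigma_R, integrating over log k."""
    dlnk = math.log(k_max / k_min) / n
    total = 0.0
    for i in range(n + 1):
        k = k_min * math.exp(i * dlnk)
        integrand = k ** 3 * power(k) / (2.0 * math.pi ** 2) * tophat_window(k, r) ** 2
        total += integrand * (0.5 if i in (0, n) else 1.0)
    return math.sqrt(total * dlnk)

for radius in (1.8, 8.0):
    print("R =", radius, "Mpc, sigma_R =", sigma_r(toy_power_spectrum, radius))
```

Integrating in $\log k$ makes it visible which decade of wavenumbers dominates: the window cuts the integrand off above $k \sim 1/R$, so $\sigma_R$ is set by the largest value of $k^3 P(k)$ on scales larger than $R$.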

To evaluate $\sigma_R$, we use the CAMB code [42] to compute the CDM power spectrum $P_{cc}(k)$ as a function of time, in models with different neutrino mass. We evaluate the integral in Eq. (24) numerically. We have also checked our results using the CLASS code [13]. We noticed a small discrepancy in the output of $k^3 P_{cc}$ at the largest neutrino masses we consider, $m_\nu \sim 10$ eV, where CLASS gives a slightly larger amplitude for the free-streaming peak. By lowering the cutoff on $m_\nu$ described in Sec. III D, the CLASS output would only strengthen the anthropic explanation of the observed neutrino mass range.

Available Boltzmann codes do not return power spectra for negative redshifts, that is, for times when the CMB temperature is below 2.7 K. In this regime only, we estimate $\sigma_R$ by extrapolating our numerical results to negative redshifts semi-analytically as described in Appendix B 2. This regime is not a dominant contributor to the overall probability distribution, due to the smallness of the causal patch at late times, and since vacuum domination terminates structure formation in any case.

We compute $F_R$ and $\dot F_R$ from Eq. (21); the results are shown in Fig. 7.


D. Galaxy Formation: Neutrino-Induced Cooling 
Catastrophe 


The final factor $G_R(m_\nu, t)$ in Eq. (19) is the fraction of the halo mass in baryons within galaxies. To approximate this, we must first investigate the effect of a top-down structure scenario (present at $m_\nu \gtrsim 8$-$10$ eV, as discussed in Sec. I) on galaxy formation.

In our universe galaxies form in halos with masses be¬ 
tween IO^Mq and IO^^Mq. Larger halos can inherit 
galaxies from mergers, resulting in galaxy groups and 
clusters, with masses ranging from 10^^Mq to 10^^Mq. 
However, halos in the latter mass range do not them¬ 
selves produce a significant amount of stars, relative to 
their total mass. 

This fact can be understood as a consequence of the ability, or failure, of baryons to cool rapidly inside newly formed dark matter halos. (For more detail, see Appendix C and references given there.) Baryons are shock-heated to a virial temperature when they fall into a large dark matter halo. In order to condense into a galaxy at the center of the halo, the baryons must first shed their thermal energy. Cooling can occur by bremsstrahlung at temperatures large enough to ionize hydrogen, or by atomic and molecular line cooling at the lower temperatures attained in smaller halos.

Analytically, one can estimate the time it takes baryons to cool, $t_{\rm cool}$. The cooling time grows with the mass of the halo (for large masses), and with the time of its formation. It is also easy to compute the gravitational timescale of the halo, $t_{\rm grav}$, which is somewhat shorter than the time of its formation.

A good match to observation is obtained by the following criterion. If $t_{\rm cool} < t_{\rm grav}$, then cooling is efficient. A significant fraction of baryons (up to 10%) is converted into stars. This process occurs rapidly, on a timescale that can be treated as instantaneous compared to the age of the universe when the halo formed.

On the other hand, if $t_{\rm cool} > t_{\rm grav}$, then star formation is limited by the cooling time. In this regime, one would still expect a certain amount of rapid star formation at the dense core of the halo, but this is not seen in observations. (This is known as the cooling flow problem.) Observations do not constrain the possibility that a significant portion of baryons will form stars in the distant future, on a timescale much greater than the age of the universe. This time would greatly exceed $t_\Lambda$. Since the causal patch is of a fixed physical size of order the de Sitter horizon scale, there will be exponentially few halos left in it at late times. Thus, star formation at very late times does not contribute to the probability
of a particular universe. (This sensitivity to the mat¬ 
ter content inside the cosmological horizon is a key fea¬ 
ture distinguishing the causal patch from other interest¬ 
ing measures, such as the fat geodesic or scale factor time 
cutoff m, and it is responsible for several of the chief 
successes of the causal patch, e.g. gTlISTlIMHl]-) Thus, 
we may take $t_{\rm cool} < t_{\rm grav}$ as a robust condition for galaxy formation to occur in a newly formed halo.

The cooling function that determines the rate of heat dissipation has a complicated form in the relevant halo mass range (see [85] and references therein). Appendix C describes two different approximations to $t_{\rm cool}$ and $t_{\rm grav}$ that capture different cooling regimes that halos in our analysis might explore. One finds in either regime that at late times, cooling is inefficient for halo masses above the scale of the Milky Way halo:

$$ M_{\rm vir} \gtrsim 10^{12} M_\odot , \quad t_{\rm vir} \gtrsim O({\rm Gyr}) \quad \Longrightarrow \quad \text{No Galaxy} \qquad (26) $$


Importantly, the boundary is consistent with the obser¬ 
vation that in our universe, there are no galaxies much 
larger than the Milky Way. 

It would be interesting to implement a more precise version of the above boundary as a cutoff on the time until galaxy formation is efficient, at any value of $m_\nu$. Massive neutrinos delay structure formation more dramatically than they suppress it (Fig. 7), so such a cutoff would exclude an appreciable fraction of halos from contributing to galaxy formation even at rather small $m_\nu$. Thus it would lead to a greater suppression of intermediate neutrino masses between 1 and 10 eV, and thus would favor the anthropic approach. Instead, we will argue more conservatively for a cooling cutoff on $m_\nu$, around 10 eV. We will now identify a change of regime for $m_\nu \gtrsim 10$ eV. As we shall see, this transition places the dominant halo population so far into the regime of inefficient cooling that the above rough estimate suffices to conclude that galaxy formation is highly suppressed.

For $m_\nu \lesssim 8$-$10$ eV, recall that the dimensionless matter power spectrum $k^3 P_{cc}(k)$ increases monotonically with $k$ (see the power-spectrum figure) and the integral for the smoothed density contrast $\sigma_R$ in Eq. (24) is dominated by the power at the small galactic scale $R$. In this range, the power spectrum preserves the standard hierarchical structure formation we see in our universe, where low mass halos generally form earlier than more massive ones. Thus, it is not likely for a $10^{12} M_\odot$ halo to be nested inside a more massive overdensity that collapses at the same time.

Above $m_\nu \approx 8$-$10$ eV, neutrinos suppress small scale power so much that the dimensionless power spectrum $k^3 P_{cc}(k)$ develops a maximum near the scale associated with free streaming, $k_{\rm nr}$ (see the power-spectrum figure). This corresponds to a mass of order 5-100 times the scale of the Milky Way halo, roughly the scale of galaxy clusters. It implies that the smoothed density contrast on small scales such as $10^{12} M_\odot$ is no longer dominated by the power at the corresponding wavenumber $k$. Instead, the integral in Eq. (24) is dominated by the maximum of the integrand, near $k_{\rm nr}$.


Footnote: The peak (the free streaming scale) moves to smaller scales as $m_\nu$ is increased. Eventually it crosses the galaxy scale: for $m_\nu \gtrsim 100$ eV neutrinos act as cold dark matter. But this does not yield an anthropically allowed region, because the dark matter to baryon density ratio $\zeta$ will be too large. This may be detrimental to disk fragmentation [481153|. If the causal patch is used, $\zeta \gg 1$ is robustly suppressed independently of any effects on galaxy and star formation, because the total mass of baryons (and thus of observers) in the patch scales like $(1 + \zeta)^{-1}$.


This implies that $10^{12} M_\odot$ overdensities become gravitationally bound at the same time as overdensities on larger scales: a top-down scenario. The virial temperature and cooling time will be set by the largest scale that the $10^{12} M_\odot$ overdensity is embedded in, $M_{\rm vir} \gg 10^{12} M_\odot$. Moreover, for such large halos virialization will occur quite late (see Fig. 7), $t_{\rm vir} \gg 5.3$ Gyr. Hence, for $m_\nu \gtrsim 8$-$10$ eV, the cooling condition in Eq. (26) becomes violated, by a substantial margin.

Note that this conclusion is insensitive to the halo mass 
scale we associate with observers. Whether we require 
10^^Mq or lO^^Af 0 halos: if the power spectrum peaks at 
larger scales, the putative galactic halos will be embedded 
in and virialize together with perturbations on a mass 
scale well above IO^^Mq, leading to a cooling problem. 


Let us summarize these considerations and formulate our cooling cutoff on the neutrino mass. If there exists some large scale $k_* < k_{\rm gal}$ such that $k_*^3 P_{cc}(k_*) > k_{\rm gal}^3 P_{cc}(k_{\rm gal})$, we interpret this as indicating top-down structure formation. Let $m_\nu^{\rm max}$ be the greatest neutrino mass sum for which this criterion is not met, i.e., the largest neutrino mass compatible with bottom-up structure formation. From Boltzmann codes we find $m_\nu^{\rm max} = 7.7$ eV for the normal hierarchy and $m_\nu^{\rm max} = 10.8$ eV for the degenerate hierarchy. We have argued that cooling fails substantially in the top-down regime, because the first virialized halos are large and form late. Hence, we treat $m_\nu^{\rm max}$ as a sharp catastrophic boundary. We approximate $G_R$ as a step function that vanishes past this critical mass:

$$ G_R(m_\nu, t) = \begin{cases} 1 , & m_\nu \leq m_\nu^{\rm max} \\ 0 , & m_\nu > m_\nu^{\rm max} \end{cases} \qquad (27) $$
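The top-down criterion and the step-function galaxy fraction can be sketched as follows. The code assumes the power spectrum is available as samples on an ascending grid of wavenumbers (as a Boltzmann code would return), and flags a spectrum as top-down if any larger scale carries more dimensionless power $k^3 P(k)$ than the galactic scale. The function names, the grid-based comparison, and the threshold argument `m_nu_max` are choices made for this illustration.

```python
from bisect import bisect_right

def is_top_down(ks, ps, k_gal):
    """True if some k_* < k_gal has k_*^3 P(k_*) > k_gal^3 P(k_gal).

    ks: ascending wavenumbers; ps: power spectrum samples on the same grid.
    The galactic scale is taken to be the last grid point with k <= k_gal.
    """
    i_gal = bisect_right(ks, k_gal) - 1
    if i_gal < 0:
        raise ValueError("k_gal lies below the sampled k range")
    gal_power = ks[i_gal] ** 3 * ps[i_gal]
    return any(ks[i] ** 3 * ps[i] > gal_power for i in range(i_gal))

def galaxy_fraction(m_nu, m_nu_max):
    """Step-function approximation: galaxies form only below the critical mass."""
    return 1.0 if m_nu <= m_nu_max else 0.0

print(galaxy_fraction(5.0, 7.7), galaxy_fraction(9.0, 7.7))
```

Scanning $m_\nu$ upward and recording the last value for which `is_top_down` returns `False` would give the critical mass that enters the step function; a monotonically rising $k^3 P(k)$ is always classified as bottom-up.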


We evaluate the integral in Eq. (19) numerically using Mathematica. The integration is started before structure begins to form, at redshift $z = 12$, when $F_R$ is negligible. The integration is terminated deep in the vacuum era when $r_{\rm patch}$ becomes exponentially small. Our final result is described in Sec. I; see the figures there.
Appendix A: Cosmological Constant and the Causal Patch

The cosmological constant offers a nice example of the predictive power of a large landscape, and it also illustrates the advantages of the causal patch measure over competing proposals. In this appendix we review Weinberg’s 1987 prediction of a positive cosmological constant [5], which has since been confirmed by observation [62, 63]. We then turn to the more recent success of the causal patch measure in improving the quantitative agreement with the observed magnitude of $\Lambda > 0$ (particularly in settings where the primordial density contrast is also allowed to vary), while eliminating specific anthropic assumptions. The goal is to make contact between an example many readers will be familiar with, and the more general formalism for making predictions in the landscape described in Sec. II.


1. Weinberg’s Prediction: $\Lambda \sim t_{\rm gal}^{-2}$

Because $\Lambda = 0$ is not a special value from the point of view of particle physics, the prior distribution over the cosmological constant $\Lambda$ should have no sharp feature near $\Lambda = 0$; hence to leading order in a Taylor expansion, $dN_{\rm vac}/d\Lambda \approx {\rm const.}$ for $|\Lambda| \ll 1$. Hence we have

$$ \mathcal{P}_{\rm vac}(\Lambda) \propto \Lambda = \exp(\log \Lambda) : \qquad (A1) $$

the prior favors large magnitude of the cosmological constant. So far, this is just a restatement of the cosmological constant problem in a landscape setting: among many (nonsupersymmetric) vacua, most will tend to have large $\Lambda$, since precise cancellations between the positive and negative contributions to $\Lambda$ are unlikely.

Footnote: In this Appendix we work in Planck units, $G = \hbar = 1$.

For $\Lambda > 0$, structure formation would be severely diminished if $\Lambda$ was large enough to dominate over the matter density of the universe before the time $t_{\rm gal}$ when density perturbations on the scale of galactic haloes would otherwise become nonlinear. (For negative $\Lambda$ of sufficient magnitude, the universe recollapses too soon.) Crudely, the weighting factor $w(x)$ may be approximated as vanishing for $\Lambda > \rho_{\rm NL}$ and constant for $\Lambda < \rho_{\rm NL}$, where $\rho_{\rm NL}$ is the energy density at that time [5]. A refinement [46] models $w(x)$ as the fraction of baryons that enter structure of a specified minimum mass.

Thus, the resulting distribution $\mathcal{P}(\log \Lambda) = w f$ peaks around $x \sim -2 \log t_{\rm gal}$. $\mathcal{P}$ is suppressed at larger values of $x$ due to the anthropic factor $w$, and at smaller values of $x$ because the prior probability $f$ is low. The model, proposed by Weinberg in 1987, thus predicted a nonzero cosmological constant not much smaller than $\rho_{\rm NL}$. Just such a value has since been discovered |5^IB5]. The model could have been ruled out at any level of confidence if, instead of a detection, the observational upper bound on $\Lambda$ had continued to improve, moving ever deeper into the region suppressed by the prior.

Weinberg’s argument had a few shortcomings, which we list here. First, the approach actually favors a somewhat larger value of $\Lambda$; the observed value is small at 2-3$\sigma$ depending on the assumptions made about the size of galaxies required by observers. More concerningly, the approach would not appear to be robust against variations of the initial density contrast $Q$. It strongly favors vacua in which both $Q$ and $\Lambda$ are larger than the observed values, unless the prior for $Q$ favors a small magnitude, or unless there is a catastrophic boundary very close to the observed values of $Q$. Neither of these arguments is easy to make.

2. Causal Patch Prediction: $\Lambda \sim t_{\rm obs}^{-2}$


In much of the older literature, the divergences of eter¬ 
nal inflation were regulated by computing the number of 
observers per baryon. (See the beginning of Sec. II C for a
brief discussion of the measure problem, and Ref. m for 
a review.) This was a reasonable first guess, particularly 
in the context of a landscape where only the cosmological 
constant varies. However, it is no longer viable in light 
of more recent insights [BH [65] . 

The ratio is not well-defined in a landscape where some vacua may not contain any baryons. Worse, it does not actually regulate all infinities, since a long-lived metastable vacuum with positive cosmological constant (such as ours) will have infinite four-volume in any comoving volume; hence, an infinite number of observers “per baryon” will be produced by thermal fluctuations at late times. The number of measures that are well-defined and not clearly ruled out is surprisingly small, and the causal patch measure has had the greatest quantitative success so far (at least [B7| when we are interested in relative probabilities for events in vacua with positive cosmological constant, as we are here). Here we give two examples.

First let us recompute the probability distribution over the cosmological constant, $dP/d\log\Lambda$ with $\Lambda > 0$, using the causal patch. We consider a class of observers that live at the (arbitrary but fixed) time $t_{\rm obs}$; for comparing with our observations, we will choose $t_{\rm obs} = 13.8$ Gyr. But the causal patch at late times coincides with the interior of the cosmological horizon. Because of the exponential expansion, the average density decreases like $e^{-3t/t_\Lambda}$. If $t_{\rm obs} \gg t_\Lambda \sim \Lambda^{-1/2}$, that is, for $t_\Lambda \ll O(10)$ Gyr, no observers will be present in the patch, no matter whether or not galaxies form. This is a much more stringent cutoff than the suppression of galaxy formation, which only sets in for a larger value of $\Lambda$, such that $t_{\rm gal} \gtrsim t_\Lambda$. It agrees very well with the observed value of $\Lambda$, resolving the mild (2-3$\sigma$) tension with Weinberg’s estimate. It is unaffected by any increase in the primordial density contrast, since $t_{\rm obs}$ contains Gyr time scales that are not shortened by hastening structure formation. It solves the “Why Now” problem directly. And it does all this without making any specific assumptions about the nature of observers, except that they are made of stuff that redshifts faster than vacuum energy. (However, in the present paper we do assume that observers require galaxies.)
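As a rough illustration of the argument above (a toy sketch, not the calculation performed in this paper), one can combine the prior $dN_{\rm vac}/d\log\Lambda \propto \Lambda$ with the late-time dilution factor $e^{-3t_{\rm obs}/t_\Lambda}$ and locate the maximum. The sketch assumes $t_\Lambda = \Lambda^{-1/2}$ with unit coefficient and ignores galaxy formation and the early-time geometry of the patch; the function names are hypothetical.

```python
import math

def log_weight(u):
    """Log of the toy dP/dlog(Lambda), up to an additive constant.

    u = t_obs / t_Lambda. Assumption: Lambda is proportional to u**2
    (prior proportional to Lambda) and matter is diluted by exp(-3 u).
    """
    return 2.0 * math.log(u) - 3.0 * u

def peak_ratio(lo=1e-3, hi=10.0, n=200_000):
    """Grid search for the value of u that maximises the toy distribution."""
    best_u, best_w = lo, log_weight(lo)
    for i in range(1, n + 1):
        u = lo + (hi - lo) * i / n
        w = log_weight(u)
        if w > best_w:
            best_u, best_w = u, w
    return best_u

u_peak = peak_ratio()
t_obs_gyr = 13.8  # observation time adopted in the text
print(f"peak at t_obs / t_Lambda = {u_peak:.3f}")
print(f"preferred t_Lambda = {t_obs_gyr / u_peak:.1f} Gyr")
```

In this toy version the maximum sits at $t_{\rm obs}/t_\Lambda = 2/3$ (set the derivative of $2\log u - 3u$ to zero), so the preferred $t_\Lambda$ is of the same order as $t_{\rm obs}$, which is the qualitative content of the statement above.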

The causal patch can also explain why dark and baryonic matter have comparable abundances: the “Why Comparable” coincidence. One makes the qualitative assumption that the prior for the dark-to-baryonic density ratio $\zeta$ favors large values. But when $\zeta \gg 1$, the causal patch suppresses baryonic observers by a factor $1/(1 + \zeta)$, which counteracts the prior distribution, leading to the prediction that $\zeta \sim O(1)$ [47].


Appendix B: Structure Formation with Neutrinos 

Our calculation was done almost entirely using Boltzmann codes, not analytic approximations. However, for completeness we summarize here the physical origin of the effects of neutrinos on structure formation. In the final subsection, B 2, we explain the semi-analytic extrapolation formula we have used to extend the code output to negative redshifts. For excellent in-depth treatments of neutrino cosmology, see Refs. |4il H5] .


1. Neutrino Cosmology


Around a second after the big bang, at the time of decoupling, neutrinos are frozen out with a Fermi-Dirac distribution whose temperature is set by the primordial plasma. Due to annihilations that heat up the plasma soon after neutrino decoupling, this temperature differs from the temperature of the CMB, which decouples from the plasma much later: $T_{\nu 0} = (4/11)^{1/3}\, T_{\rm CMB} = 1.95$ K.

The energy density and pressure of a single neutrino with mass $m_\nu$ at a fixed time since decoupling are thus approximately given by

$$\rho_\nu = 2\int \frac{d^3p}{(2\pi)^3}\,\frac{\sqrt{p^2+m_\nu^2}}{e^{p/T_\nu}+1} , \tag{B1}$$

$$P_\nu = 2\int \frac{d^3p}{(2\pi)^3}\,\frac{p^2}{3\sqrt{p^2+m_\nu^2}}\,\frac{1}{e^{p/T_\nu}+1} , \tag{B2}$$

where $T_\nu(z) = T_{\nu 0}(1+z)$ is the neutrino temperature as it redshifts from the value set at decoupling.


At early times, neutrinos contribute as radiation and add to the total radiation density as

$$\rho_r = \left[1 + \frac{7}{8}\left(\frac{4}{11}\right)^{4/3} N_{\rm eff}\right]\rho_\gamma , \tag{B3}$$

where

$$\rho_\gamma = \frac{\pi^2}{15}\, T_{\rm CMB}^4 , \tag{B4}$$

and where $N_{\rm eff} = 3.046$ is the effective number of neutrino species, with a slight deviation from 3 due to non-thermal spectral distortions from the annihilations.
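To make the size of the neutrino contribution concrete, the following minimal sketch evaluates the ratio of total radiation density to photon density, assuming the standard form $1 + \tfrac{7}{8}(4/11)^{4/3} N_{\rm eff}$ for the bracket in Eq. (B3). The function name is hypothetical.

```python
N_EFF = 3.046  # effective number of neutrino species quoted in the text

def radiation_to_photon_ratio(n_eff=N_EFF):
    """rho_r / rho_gamma at early times, when all neutrinos are relativistic.

    Assumes the standard expression 1 + (7/8) (4/11)^(4/3) N_eff.
    """
    return 1.0 + (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0) * n_eff

ratio = radiation_to_photon_ratio()
print(f"rho_r / rho_gamma = {ratio:.4f}")
print(f"neutrino share of radiation = {(ratio - 1.0) / ratio:.3f}")
```

With these inputs the relativistic neutrinos carry roughly 40% of the radiation energy density.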

Similarly, the number density of neutrinos per species is set by the CMB number density:

$$n_\nu = \frac{3}{11}\, n_\gamma , \tag{B5}$$

where

$$n_\gamma = \frac{2\zeta(3)}{\pi^2}\, T_{\rm CMB}^3 . \tag{B6}$$

Neutrinos become approximately non-relativistic once their thermal energy drops below the relativistic kinetic energy, $3T_\nu(z) < m_\nu$, which occurs at a redshift $z_{\rm nr}$ of$^{13}$

$$1 + z_{\rm nr} = 1991\left(\frac{m_\nu}{1\,{\rm eV}}\right) . \tag{B7}$$
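The coefficient in Eq. (B7) can be checked from the criterion $3T_\nu(z) = m_\nu$ with $T_\nu(z) = T_{\nu 0}(1+z)$. The sketch below is only a consistency check: the present CMB temperature of 2.7255 K and the value of the Boltzmann constant are assumed inputs not stated in the text, and the function names are hypothetical.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K
T_CMB_K = 2.7255            # assumed present-day CMB temperature in K

def neutrino_temperature_today_k(t_cmb_k=T_CMB_K):
    """T_nu0 = (4/11)^(1/3) T_CMB, in kelvin."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * t_cmb_k

def one_plus_z_nr(m_nu_ev, t_cmb_k=T_CMB_K):
    """1 + z_nr from 3 T_nu0 (1 + z_nr) = m_nu, with m_nu in eV."""
    t_nu0_ev = neutrino_temperature_today_k(t_cmb_k) * K_B_EV_PER_K
    return m_nu_ev / (3.0 * t_nu0_ev)

print(f"T_nu0 = {neutrino_temperature_today_k():.3f} K")
for m in (0.06, 1.0, 10.0):
    print(f"m_nu = {m:5.2f} eV  ->  1 + z_nr = {one_plus_z_nr(m):.0f}")
```

The coefficient comes out near 1990 per eV; small differences from the quoted value reflect the adopted constants. As the footnote stresses, the transition itself is far from sudden, so this redshift is only a convenient marker.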

Well after this transition, the density of non-relativistic neutrinos asymptotes to

$$\rho_\nu = m_\nu n_\nu , \tag{B8}$$

where $m_\nu$ is the sum of masses of all non-relativistic neutrino species. In terms of this, the neutrino density parameter counting only massive neutrinos is

$$\Omega_\nu = \frac{\rho_\nu}{\rho_*} , \tag{B9}$$

where $\rho_*$ is the critical density defined by $H_0^2 = 8\pi G\rho_*/3$, which gives

$$\Omega_\nu h^2 \approx \frac{m_\nu}{94\,{\rm eV}} . \tag{B10}$$

The neutrino free streaming scale is set by the typical distance neutrinos travel thermally up to a given time. Roughly, it is given by the horizon scale at early times and stops growing soon after the neutrinos become non-relativistic; hence it can be crudely approximated by the horizon scale at the nonrelativistic transition, $k_{\rm nr}$.


$^{13}$ The non-relativistic transition is far from sudden. The neutrino pressure, Eq. (B2), has a non-negligible tail long after the redshift of Eq. (B7), which smears out the transition. We thank J. Lesgourgues for explaining this point to us.

[Figure: plot of $G_\Lambda(x)$.]

FIG. 8: The growth factor, Eq. (B14) (solid line), which behaves like $x^{1/3}$ (dashed line) during the matter era, and asymptotes to a constant value well above $x(t_\Lambda) = 1$.


On small scales, there are two effects by which neutrinos suppress structure. The most obvious is that density perturbations will be washed out. Thus, free streaming eliminates the contribution of neutrinos to structure, and thus suppresses the total matter power by a factor $\sim (1 - f_\nu)^2$, where

$$f_\nu = \frac{\Omega_\nu}{\Omega_m} \tag{B11}$$

defines the massive neutrino fraction. Conversely, on larger scales neutrinos will remain confined to the overdense regions and will behave like cold dark matter.

A secondary but more important effect is that the density of massive neutrinos contributes via the Friedmann equation to the Hubble parameter, which controls the friction term in the growth of matter perturbations. But on short scales, they do not contribute to the source term (the density contrast). Therefore, CDM perturbations grow more slowly in the presence of a nonclustering matter component on short scales m-


results for $\sigma_R(z)$ from positive to negative $z$, i.e., from $a < 1$ to $a > 1$. The most straightforward approach would be a linear extrapolation in some time variable, fitting both the value and the derivative of $\sigma_R$ at $z = 0$. However, there is a physical effect that we must incorporate analytically: vacuum domination turns off structure growth on all scales. This effect is not strong enough at $z = 0$ to have a significant imprint on the value or time derivative of $\sigma_R$. However, the effect is also rather simple, and thus easy to incorporate analytically.

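In a universe with negligible neutrino mass, the CDM density contrast grows as [151 [HH]

$$\delta \propto G_\Lambda(x) = \frac{5}{6}\sqrt{1+\frac{1}{x}}\int_0^x \frac{dy}{y^{1/6}(1+y)^{3/2}} , \tag{B14}$$

where

$$x = \frac{\rho_\Lambda}{\rho_m} = \left.\frac{\Omega_\Lambda}{\Omega_m}\right|_{z=0}(1+z)^{-3} . \tag{B15}$$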


As seen in Fig. 8, density perturbations grow like the scale factor during the matter dominated era; they asymptote to a constant value at times $t \gg t_\Lambda$.
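The two limits just described can be verified numerically. The sketch below assumes the standard form of the growth factor, $G_\Lambda(x) = \tfrac{5}{6}\sqrt{1+1/x}\int_0^x dy\, y^{-1/6}(1+y)^{-3/2}$, and evaluates it with a substitution $y = u^6$ that removes the integrable singularity at $y = 0$. The integration scheme and function name are illustrative choices, not the method used for the figures.

```python
import math

def growth_factor(x, n=4000):
    """G_Lambda(x) for x = rho_Lambda / rho_m > 0.

    With y = u**6 the integrand becomes 6 u**4 / (1 + u**6)**1.5 on
    [0, x**(1/6)], which is smooth; it is integrated with composite
    Simpson's rule using n (even) intervals.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    upper = x ** (1.0 / 6.0)
    h = upper / n

    def integrand(u):
        return 6.0 * u**4 / (1.0 + u**6) ** 1.5

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    integral = total * h / 3.0
    return (5.0 / 6.0) * math.sqrt(1.0 + 1.0 / x) * integral

# Closed form of the late-time limit: (5/6) B(5/6, 2/3)
G_INFINITY = (5.0 / 6.0) * math.gamma(5.0 / 6.0) * math.gamma(2.0 / 3.0) / math.gamma(1.5)

for x in (1e-6, 1e-3, 1.0, 1e3, 1e6):
    print(f"x = {x:8.0e}   G = {growth_factor(x):.5f}   x^(1/3) = {x ** (1.0 / 3.0):.5f}")
print(f"asymptotic value = {G_INFINITY:.5f}")
```

For $x \ll 1$ the result tracks $x^{1/3}$, i.e. the scale factor, while for $x \gg 1$ it saturates at a constant of order 1.4.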

With nonzero neutrino mass, a reasonable approximation is obtained by combining the analytic result for the matter era, Eq. (B12), with the $m_\nu = 0$ transition to the vacuum dominated era:

$$\delta \propto G_\Lambda(x) \quad (k < k_{\rm nr}) , \tag{B16}$$

$$\delta \propto G_\Lambda(x)^p \quad (k > k_{\rm nr}) . \tag{B17}$$


Recall that $P_{cc}(k) \propto \delta_c(k)^2$ by Eq. (25).


In order to improve on this result, we can incorporate the information gained from the use of Boltzmann codes. Instead of computing $p$ and $k_{\rm fs}$ analytically as described in the previous subsection, we can read off a slope $p(k)$ from the numerical output near $z = 0$:

$$p(k) = \frac{1}{2}\,\frac{d\log P_{cc}(x)}{d\log G_\Lambda(x)} . \tag{B18}$$


$$\delta_c \propto a , \quad k < k_{\rm nr} ,$$

$$\delta_c \propto a^p , \quad k > k_{\rm nr} , \tag{B12}$$

where


We can also fix the constant of proportionality $C$ by matching the magnitude of $P_{cc}$ obtained from CAMB at $z = 0$. This yields a semi-analytic power spectrum as a function of time, for any fixed $k$ and fixed neutrino mass:


$$p = \frac{-1 + \sqrt{1 + 24(1 - f_\nu)}}{4} \approx 1 - \frac{3}{5} f_\nu , \tag{B13}$$

with the last approximation valid in the limit of small neutrino masses.
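The accuracy of the small-mass approximation is easy to gauge numerically. The sketch below assumes the standard matter-era exponent $p = \left[-1 + \sqrt{1 + 24(1 - f_\nu)}\right]/4$ and its linearization $1 - \tfrac{3}{5} f_\nu$; the function names are hypothetical.

```python
import math

def growth_exponent(f_nu):
    """Exponent p in delta_c ∝ a^p on scales where neutrinos do not cluster."""
    if not 0.0 <= f_nu <= 1.0:
        raise ValueError("f_nu must lie in [0, 1]")
    return (-1.0 + math.sqrt(1.0 + 24.0 * (1.0 - f_nu))) / 4.0

def growth_exponent_linearized(f_nu):
    """First-order expansion of the exponent for small f_nu."""
    return 1.0 - 0.6 * f_nu

for f_nu in (0.0, 0.01, 0.05, 0.1, 0.3):
    exact = growth_exponent(f_nu)
    approx = growth_exponent_linearized(f_nu)
    print(f"f_nu = {f_nu:4.2f}   p = {exact:.4f}   1 - 3 f_nu / 5 = {approx:.4f}")
```

The linearized form slightly overestimates $p$, with the difference growing quadratically in $f_\nu$; at $f_\nu = 0.1$ the two agree to better than 0.2%.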


2. Late-Time Extrapolation of Numerical Results 

Available Boltzmann codes do not offer output for negative redshifts. In order to estimate the smoothed density contrast $\sigma_R$ in this regime, we extrapolate our numerical


$$P_{cc}(x) = C\, G_\Lambda(x)^{2p(k)} . \tag{B19}$$


[Figure: two panels, (a) $C_{\rm eff}$ and (b) $p_{\rm eff}$, each plotted against neutrino mass (eV).]

FIG. 9: The parameters $C_{\rm eff}$ and $p_{\rm eff}$, for normal (orange, top) and degenerate (green, bottom) hierarchies, obtained by fitting Eq. (B22) to CAMB output for $\sigma_R$ and its derivative at $z = 0$. The resulting fitting function for $\sigma_R(z)$ is used to compute the Press-Schechter factor at negative redshift only. Note that $p_{\rm eff} \approx 1$ throughout. This may seem surprising, but it is consistent with our earlier finding that at large neutrino masses, the scales whose power contributes dominantly to $\sigma_R$ are precisely the ones on which free-streaming is not effective. This is closely related to the discrepancy we find with Ref. [53], whose estimate $p_{\rm eff} \approx p(k_{\rm gal}) \approx 1 - 8 f_\nu$ would yield a monotonically decreasing curve in (b).

In practice, it is cumbersome to extrapolate the power at each wave number only to integrate over scales to obtain the smoothed density contrast. By the late time corresponding to $z = 0$, for any neutrino mass, we expect that the integral in Eq. (24) is dominated by the power at some scale $k$ and will remain dominated by the same scale in the future ($z < 0$). For small neutrino masses, this scale will be set by the galaxy scale; for large $m_\nu$, it will be the scale of the peak of the spectrum $k^3 P(k)$. We incorporate this by matching the analytic growth for


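$z < 0$ directly to the numerical results for $\sigma_R(x)$ at $z = 0$. For every $m_\nu$, we compute

$$p_{\rm eff} = \left.\frac{d\log\sigma_R(x)}{d\log G_\Lambda(x)}\right|_{z=0} , \tag{B20}$$

$$C_{\rm eff} = \left.\frac{\sigma_R(x)}{G_\Lambda(x)^{p_{\rm eff}}}\right|_{z=0} \tag{B21}$$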


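where $M_{\rm vir}$ is the mass of the halo and $R_{\rm vir}$ is its virial radius. In the regime of interest for us, $T_{\rm vir}$ is large enough to ionize hydrogen. Then one can take the average molecular mass $\mu$ to be $m_p/2$, where $m_p$ is the mass of the proton. With $M_{\rm vir} = \frac{4\pi}{3}\rho_{\rm vir} R_{\rm vir}^3$ one finds

$$T_{\rm vir} \propto M_{\rm vir}^{2/3}\, \rho_{\rm vir}^{1/3} , \tag{C2}$$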


from the CAMB output for small nonnegative redshifts. The results are shown in Fig. 9.

As our semi-analytic approximation entering the Press-Schechter factor $F$ for $z < 0$ we use

$$\sigma_R(z) = C_{\rm eff}\, G_\Lambda(x(z))^{p_{\rm eff}} \quad \text{[used for } z < 0 \text{ only]} \tag{B22}$$

with $G_\Lambda$ given by Eq. (B14). We have checked that the same formula provides an excellent fit to the numerical results at $z > 0$, as one would expect. However, we stress again that we use the output from the CAMB code in this regime, not the fitting function. Moreover, the regime $z > 0$ dominates in our calculation because the comoving volume of the causal patch decreases rapidly below $z = 2$.
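The matching procedure of Eqs. (B20) to (B22) can be sketched as follows. This is an illustration under stated assumptions, not the code used for our results: the derivative at $z = 0$ is replaced by a finite difference between $z = 0$ and a small redshift `dz`, the values $\Omega_\Lambda = 0.69$ and $\Omega_m = 0.31$ are placeholders, and the "numerical" input is synthetic data standing in for Boltzmann-code output. All function names are hypothetical.

```python
import math

def growth_factor(x, n=2000):
    """G_Lambda(x) of Eq. (B14), via y = u**6 and composite Simpson's rule."""
    upper = x ** (1.0 / 6.0)
    h = upper / n

    def integrand(u):
        return 6.0 * u**4 / (1.0 + u**6) ** 1.5

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (5.0 / 6.0) * math.sqrt(1.0 + 1.0 / x) * total * h / 3.0

def x_of_z(z, omega_lambda=0.69, omega_m=0.31):
    """Eq. (B15); the density parameters are illustrative assumptions."""
    return (omega_lambda / omega_m) * (1.0 + z) ** -3

def fit_effective_parameters(sigma_z0, sigma_dz, dz):
    """Finite-difference estimate of (C_eff, p_eff) from sigma_R at z=0 and z=dz."""
    g0 = growth_factor(x_of_z(0.0))
    g1 = growth_factor(x_of_z(dz))
    p_eff = math.log(sigma_dz / sigma_z0) / math.log(g1 / g0)
    c_eff = sigma_z0 / g0**p_eff
    return c_eff, p_eff

def sigma_extrapolated(z, c_eff, p_eff):
    """Eq. (B22): fitting function intended for -1 < z < 0."""
    if z <= -1.0:
        raise ValueError("z must be greater than -1")
    return c_eff * growth_factor(x_of_z(z)) ** p_eff

# Synthetic stand-in for code output at z = 0 and z = 0.01.
TRUE_C, TRUE_P, DZ = 1.2, 0.95, 0.01
sigma_z0 = TRUE_C * growth_factor(x_of_z(0.0)) ** TRUE_P
sigma_dz = TRUE_C * growth_factor(x_of_z(DZ)) ** TRUE_P

c_eff, p_eff = fit_effective_parameters(sigma_z0, sigma_dz, DZ)
print(f"C_eff = {c_eff:.4f}, p_eff = {p_eff:.4f}")
for z in (0.0, -0.3, -0.6, -0.9):
    print(f"z = {z:5.2f}   sigma_R = {sigma_extrapolated(z, c_eff, p_eff):.4f}")
```

Because $G_\Lambda$ saturates at late times, the extrapolated $\sigma_R$ flattens as $z \to -1$ rather than growing without bound, which is the physical effect the fitting function is designed to capture.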


Appendix C: Cooling and Galaxy Formation 


where the “constants” of proportionality depend negligibly on $M_{\rm vir}$.

The timescale for cooling by bremsstrahlung is

$$t_{\rm brems} \propto \frac{T_{\rm vir}^{1/2}}{\rho_{\rm vir}} \propto \frac{M_{\rm vir}^{1/3}}{\rho_{\rm vir}^{5/6}} . \tag{C3}$$

We will be interested in how this timescale compares to the age of the universe when the halo virializes,

$$t_{\rm vir} \propto \rho_{\rm vir}^{-1/2} . \tag{C4}$$

If $t_{\rm brems} < t_{\rm vir}$, then galaxy formation can be treated as instantaneous, i.e., as occurring nearly simultaneously with halo formation. Keeping track of all constants m, one finds that this case corresponds to


In this Appendix, we review the basic time scales that 
are believed to control cooling flows in dark matter halos. 
Our discussion closely follows Ref. where further 

details and references can be found. 

Baryonic gas will fall into the gravitational well of newly formed dark matter halos. The baryons are thus shock-heated to high temperatures. In order for stars to form, the baryonic gas must cool and condense. The initial temperature of the baryons is called the virial temperature. By the virial theorem,

$$T_{\rm vir} = \frac{G M_{\rm vir}\, \mu}{5 R_{\rm vir}} , \tag{C1}$$


$$M_{\rm vir}\, t_{\rm vir}^2 < (10^{12} M_\odot)(2.2\,{\rm Gyr})^2 . \tag{C5}$$

In the opposite case, $M_{\rm vir}\, t_{\rm vir}^2 \gg (10^{12} M_\odot)(2.2\,{\rm Gyr})^2$, we have $t_{\rm brems} \gg t_{\rm vir}$. In halos with these mass and virialization time combinations, galaxy formation cannot be treated as instantaneous. Instead, it takes a much greater time $t_{\rm brems} \gg t_{\rm vir}$ to convert a comparable fraction of baryons into stars. (If feedback or major mergers disrupt the cooling flow, the contrast would be even more drastic, but we will not assume this here.)

The above analysis assumed cooling of unbound charged particles by bremsstrahlung. This approximation is best for virial temperatures above $10^7$ K. At lower temperatures the cooling function is quite complicated, but one can get an estimate by treating it as independent of $T_{\rm vir}$ in some range [SU]. With this approximation, one obtains that the cooling condition is satisfied for

$$M_{\rm vir}^2\, t_{\rm vir} < (10^{12} M_\odot)^2 (5.3\,{\rm Gyr}) . \tag{C6}$$


With either scaling, one finds again that cooling is inefficient if $M_{\rm vir} \gtrsim 10^{12} M_\odot$, particularly for late virialization, $t_{\rm vir} > 10$ Gyr.

So far, we have neglected the effects of the cosmological constant. For halos that form deep in the vacuum dominated era, one should use $\rho_{\rm vir} \sim \rho_\Lambda$ instead of Eq. (C4). But such halos contribute negligibly in the causal patch because they will be exponentially dilute.

We have also neglected neutrinos. However, Eq. (C5) is sufficiently general to capture their main effect, which is to change the relation between $M_{\rm vir}$ and $t_{\rm vir}$. In a universe with $m_\nu \lesssim 8$ eV, $t_{\rm vir}$ grows logarithmically with $M_{\rm vir}$ for overdensities of a fixed relative amplitude. For $10^{12} M_\odot$ halos forming from $1\sigma$ ($2\sigma$) overdensities, $t_{\rm vir} \approx 3.6$ Gyr ($t_{\rm vir} \approx 1.3$ Gyr) and by Eq. (C5), cooling fails (succeeds).
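The two cases just quoted can be checked directly against the bremsstrahlung criterion, read as $M_{\rm vir}\, t_{\rm vir}^2 < (10^{12} M_\odot)(2.2\,{\rm Gyr})^2$. The following minimal sketch uses hypothetical function names, takes masses in solar masses and times in Gyr, and encodes only this inequality, not the full cooling function.

```python
M_REF_MSUN = 1.0e12  # reference halo mass in the cooling criterion
T_REF_GYR = 2.2      # reference virialization time in the cooling criterion

def cools_promptly(m_vir_msun, t_vir_gyr):
    """True if M_vir * t_vir^2 < (1e12 Msun) (2.2 Gyr)^2."""
    return m_vir_msun * t_vir_gyr**2 < M_REF_MSUN * T_REF_GYR**2

def max_cooling_mass(t_vir_gyr):
    """Largest halo mass (in Msun) that still satisfies the criterion."""
    return M_REF_MSUN * (T_REF_GYR / t_vir_gyr) ** 2

for label, t_vir in (("1 sigma", 3.6), ("2 sigma", 1.3)):
    verdict = "succeeds" if cools_promptly(1.0e12, t_vir) else "fails"
    print(f"{label}: t_vir = {t_vir} Gyr, cooling {verdict}, "
          f"M_max = {max_cooling_mass(t_vir):.2e} Msun")
```

The maximum mass falls as $t_{\rm vir}^{-2}$, so a halo that virializes at 10 Gyr must be lighter than about $5\times10^{10} M_\odot$ to cool promptly under this criterion.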

In a universe with $m_\nu > 8$ eV, however, small scale power is so suppressed that structure formation proceeds in a top-down manner. (This is shown in detail in the main text.) Then structure on all scales forms much later than 2.4 Gyr. Moreover, smaller structure is embedded in larger halos, which set the virial mass that enters Eq. (C5). Hence, the timescale for a significant fraction of baryons to form stars is at least $t_{\rm brems} \gg t_{\rm vir} \gg O({\rm Gyr})$.